Write slow on elastic cluster

Sunil_Chand · June 22, 2021, 8:47am

Hi All

One of our clusters has the following specs:
8 nodes (8 data and 5 master-eligible), each with 15GB of heap, running on servers with 32GB of RAM.
Elasticsearch version 7.6.2 on Linux
Each node has 8 TB of storage allocated
Around 30 billion documents in total
Each node holds 1443 shards
Total write I/O is around 350/sec
The application team is complaining that writes are slow. How can we improve write performance on the cluster?

Thanks
Sunil

Christian_Dahlqvist · June 22, 2021, 9:09am

That looks like a lot of shards per node. Significantly more than recommended. How many indices and shards are you actively indexing into? How are you indexing into Elasticsearch? What type of storage do you have?
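
To put the shard-count concern into numbers, here is a rough sketch. It assumes the rule of thumb from Elastic's 7.x-era sizing guidance of roughly 20 shards or fewer per GB of configured heap; that figure is a guideline rather than a hard limit, and the constant name is only illustrative.

```python
# Figures taken from the original post.
heap_gb = 15
shards_per_node = 1443

# Assumption: rule of thumb of ~20 shards per GB of heap (guideline, not a hard limit).
MAX_SHARDS_PER_GB_HEAP = 20

recommended_max = heap_gb * MAX_SHARDS_PER_GB_HEAP
overshoot = shards_per_node / recommended_max

print(f"recommended ceiling: {recommended_max} shards per node")
print(f"actual: {shards_per_node} shards per node ({overshoot:.1f}x the ceiling)")
```

Under that assumption, each node carries several times the suggested number of shards for a 15GB heap.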

Sunil_Chand · June 22, 2021, 12:55pm

Hi Christian

Storage is XIO flash storage.
Ingestion is done by a Python script; at any given time there are 34 processes capable of writing.
Each process writes to only a single index at a time. Index refresh is disabled when a process starts on a particular work item,
and on completion of the work item it triggers a manual refresh.

Roughly 2.2k items are ingested per day, which isn't 1:1 with indices (several of these items share an index).

Thanks
Sunil
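
The workflow described above (disable refresh, bulk-write, refresh manually) maps onto three REST calls. The sketch below only builds the requests and sends nothing, so it runs without a cluster. The function name `work_item_requests` and the index name `items-demo` are hypothetical, and the poster's actual script may well be structured differently.

```python
import json


def work_item_requests(index, docs):
    """Return the (method, path, body) sequence for loading one work item."""
    # NDJSON bulk body: one action line plus one source line per document.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    bulk_body = "\n".join(lines) + "\n"  # the bulk API requires a trailing newline

    return [
        # 1. Disable periodic refresh on the index while the work item loads.
        ("PUT", f"/{index}/_settings",
         json.dumps({"index": {"refresh_interval": "-1"}})),
        # 2. Send the documents in one bulk request.
        ("POST", "/_bulk", bulk_body),
        # 3. Refresh manually so the new documents become searchable.
        ("POST", f"/{index}/_refresh", ""),
    ]


# Hypothetical index name and documents, for illustration only.
for method, path, body in work_item_requests("items-demo", [{"id": 1}, {"id": 2}]):
    print(method, path, f"({len(body)} bytes)")
```

Since several work items can share an index, one process disabling refresh affects every other process writing to that index, which is worth keeping in mind when reading the timings.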

Christian_Dahlqvist · June 22, 2021, 12:59pm

What bulk size are you using?

Sunil_Chand · June 22, 2021, 2:20pm

Hi Christian

500 events/chunk, max chunk size of 50MB, 4 queues

Thanks
Sunil
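
As an illustration of what "500 events/chunk, max size of 50MB" means in practice, here is a minimal batching sketch that closes a chunk when either limit is reached. The thread does not say whether the script uses a client helper or its own batching, so `chunk_events` is a hypothetical stand-in; it counts only the serialized event, not the bulk action line.

```python
import json


def chunk_events(events, max_events=500, max_bytes=50 * 1024 * 1024):
    """Yield lists of events, closing a chunk when either limit would be exceeded."""
    chunk, chunk_bytes = [], 0
    for event in events:
        size = len(json.dumps(event).encode("utf-8"))
        # Close the current chunk if adding this event would break a limit.
        if chunk and (len(chunk) >= max_events or chunk_bytes + size > max_bytes):
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(event)
        chunk_bytes += size
    if chunk:
        yield chunk  # final, possibly partial chunk


# Made-up events, for illustration only.
sizes = [len(c) for c in chunk_events({"n": i} for i in range(1200))]
print(sizes)
```

With small events the count limit is the one that closes each chunk; the byte cap only matters when individual events are large.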

Christian_Dahlqvist · June 22, 2021, 3:09pm

What is the average size of your events? Are you using dynamic mappings?

warkolm · June 22, 2021, 10:55pm

That's too many shards per node; you need to at least halve that.

Sunil_Chand · June 23, 2021, 1:25pm

Hi Christian

The average event size is around 3kb, and dynamic mappings are being used.

Thanks
Sunil
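
Combining the event size with the bulk settings mentioned earlier gives a rough idea of how large each bulk request is. This is back-of-the-envelope arithmetic that assumes "3kb" means kilobytes and ignores the per-document action line.

```python
events_per_chunk = 500   # from the thread
avg_event_kb = 3         # from the thread; assumption: kilobytes
chunk_cap_mb = 50        # from the thread

# Approximate payload of one full chunk, in MB.
chunk_payload_mb = events_per_chunk * avg_event_kb / 1024

# Number of average-sized events it would take to reach the byte cap.
events_to_hit_cap = chunk_cap_mb * 1024 // avg_event_kb

print(f"payload per chunk: ~{chunk_payload_mb:.2f} MB")
print(f"events needed to reach the {chunk_cap_mb}MB cap: {events_to_hit_cap}")
```

On these numbers each bulk request is on the order of 1.5 MB, so the 500-event limit is reached long before the 50MB cap.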

Christian_Dahlqvist · June 23, 2021, 1:28pm

What unit is this? What ingest rate are you seeing?

Sunil_Chand · June 23, 2021, 1:29pm

Hi warkolm

Do we need to add more nodes, or increase the memory? Our infrastructure team says that not even half of the available memory is being used.

Thanks
Sunil

Sunil_Chand · June 23, 2021, 2:11pm

Hi Christian

The I/O statistics were captured from the Kibana console, and the ingestion rate is 1.2k events per second.

Thanks
Sunil
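
For scale, the reported ingest rate can be converted into data volume and per-process load. The sketch assumes 3kb means kilobytes, that chunks are always full, and that all 34 processes are writing at once; none of that is confirmed in the thread.

```python
events_per_sec = 1200     # from the thread
avg_event_kb = 3          # from the thread; assumption: kilobytes
writer_processes = 34     # from the thread; assumption: all active at once
events_per_chunk = 500    # from the thread; assumption: chunks are full

mb_per_sec = events_per_sec * avg_event_kb / 1024
bulk_requests_per_sec = events_per_sec / events_per_chunk
events_per_sec_per_process = events_per_sec / writer_processes

print(f"cluster-wide volume: ~{mb_per_sec:.1f} MB/s")
print(f"bulk requests: ~{bulk_requests_per_sec:.1f} per second")
print(f"per process: ~{events_per_sec_per_process:.0f} events/s")
```

Under these assumptions the whole cluster receives only a few MB per second, spread across 8 data nodes.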

Christian_Dahlqvist · June 23, 2021, 4:18pm

How have you determined that it is the Elasticsearch cluster that is the bottleneck and not the Python scripts you are using to ingest the data?
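
One way to approach that question is to compare the wall-clock time of each bulk call on the client with the `took` value (milliseconds) that Elasticsearch returns in the bulk response. The sketch below is only the bookkeeping part: `summarize` is a hypothetical helper, and the sample timings are made up. The wall-clock figure would come from something like `time.perf_counter()` around the client call.

```python
def summarize(samples):
    """Summarize (wall_seconds, took_ms) pairs, one pair per bulk request."""
    samples = list(samples)
    total_wall = sum(wall for wall, _ in samples)
    server_seconds = sum(took for _, took in samples) / 1000.0
    return {
        "requests": len(samples),
        # Time Elasticsearch itself reported spending on the requests.
        "server_seconds": server_seconds,
        # Everything else: serialization, network, client-side waiting.
        "client_and_network_seconds": max(0.0, total_wall - server_seconds),
        "server_share": server_seconds / total_wall if total_wall else 0.0,
    }


# Made-up timings, for illustration only.
print(summarize([(1.0, 200), (1.0, 300)]))
```

A low server share would point at the client or the network rather than the cluster. Rejections in the write thread pool (`GET _cat/thread_pool/write?v`) are another signal worth checking on the cluster side.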
