Indexing performance drop after a few hours

Hi,
I'm having an issue with Elasticsearch: after a few hours of running (2-3 hours) there is a massive performance drop (50%-60%), but no obvious symptoms as to why.

Setup:
Three nodes: one master and two data nodes
Elasticsearch version: 7.12.0
Configuration: All recommended settings are in place (e.g., ulimits)
Purpose: Heavy Indexing Operation

CPU, memory, heap, file descriptors, thread count, hot threads, and merge/delete/refresh stats all look normal (no increase), and my application runs fine if I disable Elasticsearch, so the problem seems to be on the Elasticsearch side. I have also tried various heap sizes, node memory settings, and refresh intervals, but none of them make any difference, and there is nothing in the log files (log level: info).
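
For reference, these counters can all be pulled from the node stats API, roughly like this (a minimal sketch using Python requests; the host is a placeholder for my cluster):

```python
import time
import requests

ES = "http://localhost:9200"  # placeholder; adjust host/credentials for your cluster

def snapshot():
    # _nodes/stats exposes the indexing, merge and JVM counters being watched
    stats = requests.get(f"{ES}/_nodes/stats/indices,jvm").json()
    for node in stats["nodes"].values():
        indexing = node["indices"]["indexing"]
        merges = node["indices"]["merges"]
        heap = node["jvm"]["mem"]["heap_used_percent"]
        print(node["name"],
              "index_total:", indexing["index_total"],
              "index_time_ms:", indexing["index_time_in_millis"],
              "merge_time_ms:", merges["total_time_in_millis"],
              "heap_used_%:", heap)

while True:
    snapshot()
    time.sleep(60)  # one sample per minute over a few hours
```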

I also looked at the wire statistics using Wireshark. When the performance drops, it seems to take longer for the responses to come back (http.time), but there is no change in the RTT between the nodes.
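
A cheap client-side probe can double-check this independently of Wireshark, e.g. (sketch; the host is a placeholder, and it only measures how quickly a node answers a lightweight request, not the bulk path itself):

```python
import time
import requests

ES = "http://localhost:9200"  # placeholder

# If this stays flat while bulk responses slow down, the delay is in the
# indexing path rather than the network or general node responsiveness.
t0 = time.perf_counter()
requests.get(f"{ES}/_cluster/health")
elapsed_ms = (time.perf_counter() - t0) * 1000
print(f"cluster health responded in {elapsed_ms:.1f} ms")
```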

PS: I'm new to Elasticsearch.

Any help is appreciated.
Thanks

Having a single master eligible node is bad as it makes that node a single point of failure. I would recommend making the data nodes master eligible as well for improved resilience.

Some additional information would be useful:

  • How many indices and shards are you actively indexing into?
  • How many concurrent indexing processes/threads are you using?
  • What is the bulk size and average size of the indexed documents?
  • Are you assigning document IDs before indexing or allowing Elasticsearch to define the IDs?

Indexing in Elasticsearch can often be very I/O intensive. What type of hardware are you using? What type of storage are you using? Local SSDs?
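
If it helps, the indices/shards and the write thread pool activity can be read straight off the cat APIs, for example (a minimal sketch against the standard REST endpoints; the host is a placeholder):

```python
import requests

ES = "http://localhost:9200"  # placeholder

# Indices and shard counts you are actively indexing into
print(requests.get(f"{ES}/_cat/indices?v&h=index,pri,rep,docs.count,store.size").text)

# Write thread pool activity per node (active / queued / rejected bulk requests)
print(requests.get(f"{ES}/_cat/thread_pool/write?v&h=node_name,active,queue,rejected").text)
```

Queue growth or rejections in the write pool are usually a sign the cluster cannot keep up with the bulk load.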

Having a single master eligible node is bad as it makes that node a single point of failure. I would recommend making the data nodes master eligible as well for improved resilience.

Both data nodes are eligible to be the master node

  • How many indices and shards are you actively indexing into?

Three indices (two heavy, one light), with two shards per index (one primary and one replica)

  • How many concurrent indexing processes/threads are you using?

1-3 active threads per node

  • What is the bulk size and average size of the indexed documents?

About 400KB per bulk, with around 4.3K index requests

  • Are you assigning document IDs before indexing or allowing Elasticsearch to define the IDs?

No, IDs are assigned by Elasticsearch
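
For completeness, a bulk request where Elasticsearch assigns the IDs looks roughly like this: the action line simply carries no _id (simplified sketch; the index name and fields are placeholders):

```python
import requests

ES = "http://localhost:9200"  # placeholder

# Newline-delimited bulk body; each action line omits "_id",
# so Elasticsearch auto-generates the document IDs.
body = (
    '{"index": {"_index": "heavy-index-1"}}\n'
    '{"field1": "value", "timestamp": "2021-05-01T12:00:00Z"}\n'
    '{"index": {"_index": "heavy-index-1"}}\n'
    '{"field1": "another value", "timestamp": "2021-05-01T12:00:01Z"}\n'
)
resp = requests.post(f"{ES}/_bulk", data=body,
                     headers={"Content-Type": "application/x-ndjson"})
print("errors:", resp.json()["errors"])
```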

AWS SSDs. The disk usage shows no anomalous activity

What type of instances are you using? Exactly what type of AWS SSD storage are you using? Are you monitoring iowait on the data nodes?

gp2 storage and NVMe drives. There is nothing strange in the iowait on the data nodes.

Another point: if fewer index requests are sent, it just takes longer to reach the same performance drop
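
In case it is something cumulative (segments piling up and being merged), the per-index merge stats can be sampled over time, roughly like this (sketch; the index name is a placeholder):

```python
import requests

ES = "http://localhost:9200"  # placeholder

# Merge totals for one of the heavy indices; sampled periodically,
# a steadily growing total_time_in_millis points at merge pressure.
stats = requests.get(f"{ES}/heavy-index-1/_stats/merges").json()
merges = stats["_all"]["primaries"]["merges"]
print("merges:", merges["total"],
      "merge time (ms):", merges["total_time_in_millis"],
      "currently running:", merges["current"])
```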

What storage are the data nodes using? If gp2, what is the size of the volumes?

gp2, 100GB per node

m5.2xlarge

gp2 EBS volumes, as far as I recall, get IOPS allocated based on size. I think it is 3 IOPS per GB, so I believe your disks would only support around 300 IOPS, which is not a lot. This could very well be the limiting factor once indices grow and larger, I/O-intensive merges take place. You may want to test upgrading or increasing the size of your storage to see if that makes a difference.
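
As a rough back-of-the-envelope check (assuming that ~3 IOPS per GB baseline; the exact figures, including the minimum baseline and burst credits, are in the AWS EBS documentation):

```python
# Baseline gp2 IOPS under the assumed 3 IOPS per provisioned GB rule of thumb
volume_size_gb = 100
print(f"{volume_size_gb} GB gp2 -> ~{3 * volume_size_gb} baseline IOPS")

# For comparison, sizes you might grow to
for size in (300, 1000):
    print(f"{size} GB gp2 -> ~{3 * size} baseline IOPS")
```

If the baseline is the bottleneck, it should also be visible in the volume's CloudWatch metrics (queue length/throughput) around the time of the drop.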

Thank you very much. I will upgrade the size and see if that makes any difference.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.