High indexing ES cluster sawtooth pattern

Hello folks,
I have a classic ELK cluster for storing logs from my applications.

Elasticsearch:
version: 5.1.2
3 nodes, 8 GB of RAM each (6 GB dedicated to ES)
Running on AWS gp2 volumes provisioned for 3000 IOPS.
mlockall enabled
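
For reference, the heap and memory-lock settings above correspond to roughly this configuration (standard 5.x config files; exact paths and values are from my setup and may differ on yours):

```
# config/jvm.options -- heap set to 6 GB of the 8 GB on each box
-Xms6g
-Xmx6g

# config/elasticsearch.yml -- lock the JVM heap in memory (mlockall)
bootstrap.memory_lock: true
```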

Logstash:
version: 5.0.0 (5.2.2 gives the same results)
1 instance with default configuration.
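
By "default configuration" I mean the pipeline is essentially stock; the Elasticsearch output looks roughly like this (host names and index pattern below are placeholders):

```
output {
  elasticsearch {
    # all three ES data nodes, default bulk settings
    hosts => ["es-node-1:9200", "es-node-2:9200", "es-node-3:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
```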

Recently I realized that we can't keep up with the amount of data we are pushing into ES.
After installing X-Pack I found a sawtooth pattern in the "Indexing rate" chart.
It looks like the whole box stalls completely every 90 seconds.
I increased the heap allocation from 4 GB to 6 GB, which didn't help much.

The weird thing is that I can see the same behavior when I take the load off ES.
These gaps appear in all of the charts, which looks very strange.
I see two possible scenarios: either the ES box freezes periodically, or X-Pack is just mangling the monitoring data.
Personally, it seems impossible for the JVM heap to drop to zero just because of GC.

Any thoughts?

Under load:
[screenshots: Overview, Master node]

Off the load:
[screenshots: Overview, Master node]

Can you see anything in the logs?
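
A hot threads dump captured while one of the pauses is happening would also be useful; something like this, assuming the default port:

```
# snapshot of the busiest threads on every node during a stall
curl -XGET 'http://localhost:9200/_nodes/hot_threads?threads=5'
```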

Some comments:

not more than 50% of the RAM allocated to the heap (a quick way to check the current values is shown below).
Use machines with internal SSD drives.
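
A quick way to check how heap and physical RAM currently line up on each node (default port assumed):

```
# heap vs. physical RAM per node
curl -XGET 'http://localhost:9200/_cat/nodes?v&h=name,heap.max,heap.percent,ram.max'
```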

What does the advanced node screen show with respect to GC? What is the average size of your documents?

Logs are completely clean; indexing_slowlog and index_search_slowlog are empty.
Unfortunately, I can't use internal SSD drives. We are running on AWS and switching is not an option.

My average document size is 724 bytes.
Daily index size is about 60-75 GB.
Daily doc count is about 72-100 million.
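
For reference, those per-day numbers can be read straight from the _cat API; the index pattern below is just an example:

```
# document count and on-disk size per daily index
curl -XGET 'http://localhost:9200/_cat/indices/logstash-*?v&h=index,docs.count,store.size,pri.store.size'
```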

Here are the advanced tabs:
[screenshots: Master node, Secondary node]

You are probably going to pay the price with segment merging.

You didn't disable refresh, right?
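
(The setting I mean is index.refresh_interval, which defaults to 1s; checking or changing it per index looks like this, the index name being just an example:)

```
# check the current refresh interval for a daily index
curl -XGET 'http://localhost:9200/logstash-2017.03.01/_settings/index.refresh_interval?pretty'

# raise it to reduce segment churn, or set it to -1 to disable refresh entirely
curl -XPUT 'http://localhost:9200/logstash-2017.03.01/_settings' -d '
{
  "index": { "refresh_interval": "30s" }
}'
```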

No, I didn't.

Here is my active index monitoring

It does segment merging, but I don't see any big merges related to these gaps.
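
For completeness, the merge stats can also be pulled directly; if merging were throttling indexing, it should show up under total_throttled_time there (index pattern is just an example):

```
# cumulative merge counts, time spent merging, and throttled time per index
curl -XGET 'http://localhost:9200/logstash-*/_stats/merge?pretty'
```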

Based on your description it sounds like you might be running on m4.large nodes. If that is the case, these only have 2 CPU cores and moderate networking performance, both of which could be limiting indexing throughput. When I look at your monitoring graphs, it looks like CPU is pegged at 100% for periods of time and the heap usage is quite high (especially given that we recommend a 4 GB heap on an 8 GB host), resulting in a shallow sawtooth pattern. Given the size of the documents and the complexity of the mappings in use, it is possible that you have reached the limit of what your cluster can handle and may actually need to upgrade to a larger instance type that provides more CPU and RAM as well as better networking performance.
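
If you want to double-check that outside of the monitoring UI, CPU and heap pressure are visible directly in the node stats, e.g. (default port assumed):

```
# per-node CPU usage and heap utilisation
curl -XGET 'http://localhost:9200/_nodes/stats/os,jvm?pretty&filter_path=nodes.*.name,nodes.*.os.cpu,nodes.*.jvm.mem.heap_used_percent'
```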

Yes, we are running on m4.large boxes.
It is very possible.
I will give it another try on bigger boxes.
Thank you.
