All,
We are running an Elasticsearch indexing load (48K documents per minute on Elasticsearch 6.5) for 24 hours. There are 32 active primary shards for each 8-hour period (no replica shards). Here is what we see:
- The load runs well at the beginning.
- About 6 hours later, there is heavy disk I/O, which delays _bulk requests significantly. Then many documents fail to be written to Elasticsearch.
We believe the high disk I/O, caused by segment merges, occupied most of the available I/O time. We changed the following settings to reduce merge activity (applied roughly as in the example after this list):
- index.merge.policy.floor_segment: 8mb (default is 2mb)
- index.merge.policy.segments_per_tier: 15 (default is 10)
- index.merge.policy.max_merged_segment: 1gb (default is 5gb)
- index.merge.scheduler.max_thread_count: 1 (default is 3)
- index.refresh_interval: 120s
- index.translog.durability: async
- index.translog.sync_interval: 120s
- 4 data paths per data node (configured via path.data).
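For reference, here is roughly how we set these when creating each index (the index name below is just a placeholder; the 4 data paths are configured separately via path.data in elasticsearch.yml, not per index):

PUT load-test-000001
{
  "settings": {
    "index.number_of_shards": 32,
    "index.number_of_replicas": 0,
    "index.refresh_interval": "120s",
    "index.translog.durability": "async",
    "index.translog.sync_interval": "120s",
    "index.merge.policy.floor_segment": "8mb",
    "index.merge.policy.segments_per_tier": 15,
    "index.merge.policy.max_merged_segment": "1gb",
    "index.merge.scheduler.max_thread_count": 1
  }
}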
After that, taking one index as an example, we can see the "disk amplification" (write amplification) is about 1.96, i.e. the total bytes written by merges (39,838,526,820) divided by the store size (20,333,529,464):
"store" : {
"size_in_bytes" : 20333529464
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size_in_bytes" : 0,
"total" : 149,
"total_time_in_millis" : 9815076,
"total_docs" : 17613481,
"total_size_in_bytes" : 39838526820,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 6424989,
"total_auto_throttle_in_bytes" : 20971520
}
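(These numbers come from the index stats API; assuming the placeholder index name from the example above, the request would be something like

GET load-test-000001/_stats/store,merge

and the amplification figure is merges.total_size_in_bytes / store.size_in_bytes.)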
My question is:
Is there anything else we can do to optimize segment merging and save disk I/O for _bulk requests, beyond the settings listed above?
Thanks much,
Jill