How to determine max_num_segments for force merge?


(Anh) #1

Hi All,

In our ES cluster, we have time-based indexes ranging from a few GB to over 100 GB each (replicas not counted). The cluster uses a hot/cold model, and I'm looking at force merge (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-forcemerge.html) to see whether I can optimize the indexes on the cold nodes for searching.

I haven't found any guidelines on setting max_num_segments for force merge. For instance, we have weekly indexes of about 120 GB each with 4 primary shards. How do I determine the proper max_num_segments when running force merge on those indexes?


(Daniel Mitterdorfer) #2

Hi @anhlqn,

For your use case the optimal number is 1. Force merging is very I/O intensive, but if you don't write to the index after the force merge (which is the case for cold indexes in a hot/cold setup), Lucene only needs to search a single segment per shard, and you only need to run the force merge once.
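As a minimal sketch of what that request could look like, the snippet below builds and sends `POST /<index>/_forcemerge?max_num_segments=1` using only the Python standard library. The helper names, the base URL, the index name, and the timeout value are assumptions for illustration and are not taken from this thread.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def force_merge_url(base_url, index, max_num_segments=1):
    """Build the URL for POST /<index>/_forcemerge?max_num_segments=N."""
    if max_num_segments < 1:
        raise ValueError("max_num_segments must be at least 1")
    query = urlencode({"max_num_segments": max_num_segments})
    return f"{base_url.rstrip('/')}/{quote(index, safe='')}/_forcemerge?{query}"


def run_force_merge(base_url, index, max_num_segments=1, timeout=3600):
    """Send the force merge request and return the parsed JSON response.

    The call blocks until the merge finishes, so the timeout is generous.
    The one-hour default is an assumed value, not a recommendation.
    """
    request = Request(
        force_merge_url(base_url, index, max_num_segments), method="POST"
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `run_force_merge("http://localhost:9200", "logs-2016-w20")` (hypothetical host and index name) would then merge each shard of that index down to one segment. Because of the I/O cost, it is probably best run once the index has moved to the cold nodes and is no longer being written to.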

Daniel


(Anh) #3

Thanks @danielmitterdorfer. In one of the videos from ElasticON 2016, a support engineer recommended keeping each shard under 50 GB. Would that also be the recommended size for a Lucene segment?


(Mark Walkom) #4

No, that is not the case.
There isn't really a recommended segment size, only a recommended shard size.
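For the numbers in this thread, a rough check of the shard size could look like the sketch below. It assumes the data is spread evenly across the primary shards, and the function and constant names are made up for illustration. Since max_num_segments applies per shard, a shard merged down to one segment ends up with a segment of roughly the shard's size.

```python
SHARD_SIZE_GUIDELINE_GB = 50  # the figure quoted from the ElasticON talk


def primary_shard_size_gb(index_size_gb, primary_shards):
    """Approximate size of one primary shard, assuming an even distribution."""
    if primary_shards < 1:
        raise ValueError("primary_shards must be at least 1")
    return index_size_gb / primary_shards


# Weekly index from the question: about 120 GB across 4 primary shards.
size = primary_shard_size_gb(120, 4)
print(size, size < SHARD_SIZE_GUIDELINE_GB)  # prints: 30.0 True
```

At about 30 GB per shard, the weekly indexes stay under the quoted 50 GB shard figure, so a single segment per shard would be about that size as well.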


(Anh) #5

Got it, thanks

