According to the Elasticsearch documentation, segment merging is a background process that runs continuously. So my question is: in what cases is this background merging insufficient, so that I need to run forcemerge via the API?
We have been running forcemerge periodically in our ES cluster for some time, and we see about a 10-20% reduction in shard size (from 50 GB to 40 GB) after running it.
Also, is it recommended to run forcemerge on indices that are being actively written to?
No. It's not recommended.
Can you let me know the reason behind it?
You'll end up using much more IO than needed IMO. Elasticsearch/Lucene have good default values. I'd not really call this API.
You only want to use _forcemerge on indices which are no longer being written to, for reasons @dadoonet mentions.
The segments add overhead at both the Elasticsearch and the OS level (excessive file handles, etc.). While merges will run as data is being written, it is not to the level that can be achieved once an index is stable (no more writes). I have seen clusters that were seeing a lot of shard read errors, which were "fixed" by running _forcemerge on the older indices, freeing resources.
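To get a feel for how many segments each shard is carrying, one option is the _cat/segments API with format=json. The sketch below only tallies rows shaped like that response. The sample rows and the index name logs-2024.01.15 are made up for illustration, and fetching the response from a cluster is left out.

```python
from collections import Counter


def segments_per_shard(rows):
    """Count segments per (index, shard, prirep) from _cat/segments JSON rows."""
    counts = Counter()
    for row in rows:
        counts[(row["index"], row["shard"], row["prirep"])] += 1
    return dict(counts)


# Hypothetical sample, shaped like GET /_cat/segments/<index>?format=json
sample_rows = [
    {"index": "logs-2024.01.15", "shard": "0", "prirep": "p", "segment": "_0"},
    {"index": "logs-2024.01.15", "shard": "0", "prirep": "p", "segment": "_1"},
    {"index": "logs-2024.01.15", "shard": "1", "prirep": "p", "segment": "_0"},
]

if __name__ == "__main__":
    for key, count in sorted(segments_per_shard(sample_rows).items()):
        print(key, count)
```

A shard on an old, read-only index that still shows many segments is a likely candidate for _forcemerge.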
So if you are writing daily, weekly or monthly indices, it is a good idea to issue a _forcemerge after they are stable. This can be automated using curator.
NOTE: Keep in mind that _forcemerge will generate a significant burst of writes that might last for a few seconds or many minutes, depending on the size of the index. You must have some write IOPS headroom under normal load, or a _forcemerge will negatively impact the cluster's ability to keep up with the ingestion of new data. That said, if you are too close to your IOPS limits to run _forcemerge, you probably need to add some nodes or migrate to better storage anyway.
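If you script this yourself rather than using curator, the first step is deciding which indices count as stable. Here is a minimal sketch, assuming daily indices named with a prefix followed by a YYYY.MM.DD date (for example logs-2024.01.15). The function name, the prefix, and the naming scheme are all assumptions for illustration, not something Elasticsearch enforces.

```python
from datetime import date, datetime


def stable_daily_indices(names, today, prefix="logs-", min_age_days=1):
    """Return daily indices that are at least min_age_days old.

    Assumes index names look like '<prefix>YYYY.MM.DD'; anything else is skipped.
    """
    stable = []
    for name in names:
        if not name.startswith(prefix):
            continue
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # not a daily index in the assumed format
        if (today - day).days >= min_age_days:
            stable.append(name)
    return sorted(stable)


if __name__ == "__main__":
    names = ["logs-2024.01.13", "logs-2024.01.14", "logs-2024.01.15"]
    print(stable_daily_indices(names, today=date(2024, 1, 15)))
```

Today's index is excluded because it is presumably still being written to. Only the older ones would be passed on to _forcemerge, ideally during a quiet period given the IO burst described above.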
Thanks for your response @rcowart. Also, is there a case when I would need to use forcemerge on an index that is not being written to?
I believe the background merge job run by ES does the same job as _forcemerge. So in what cases does _forcemerge provide benefits compared to the normal background merge process?
With forcemerge you can get down to a single segment, which will not happen with a "normal" merge.
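For reference, merging down to one segment is requested with the max_num_segments parameter of the _forcemerge endpoint. The sketch below only builds the HTTP request using the Python standard library. The helper name, the base URL http://localhost:9200, and the index name are assumptions for illustration, and nothing is sent to a cluster.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def forcemerge_request(base_url, index, max_num_segments=1):
    """Build a POST request for <base_url>/<index>/_forcemerge?max_num_segments=N."""
    query = urlencode({"max_num_segments": max_num_segments})
    url = f"{base_url.rstrip('/')}/{quote(index, safe='')}/_forcemerge?{query}"
    return Request(url, method="POST")


if __name__ == "__main__":
    # Hypothetical local cluster and index name
    req = forcemerge_request("http://localhost:9200", "logs-2024.01.14")
    print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would actually trigger the merge, so only do that against an index that is no longer being written to.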