How to predict or estimate when a merge will be triggered

Is there a way to predict or estimate when a merge will be triggered during indexing?
I use Elasticsearch 1.7.2, and all merge settings are at their defaults.

Not effectively.

There is a way to avoid merges: set refresh_interval to -1. That suspends the automatic refresh, which is the step that makes your freshly indexed documents visible to search. If you're doing a bulk data load, that is a good way to speed up your indexing.
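
For example, here is a rough sketch of toggling refresh_interval around a bulk load through the index settings API. The host `http://localhost:9200`, the index name `my_index`, and the helper name `refresh_interval_request` are placeholders I made up for illustration; the code only builds the requests and does not send them.

```python
import json
import urllib.request


def refresh_interval_request(base_url, index, interval):
    """Build (but do not send) a PUT /<index>/_settings request.

    `interval` is a string such as "-1" (disable automatic refresh)
    or "1s" (the usual default).
    """
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and index name, used only for illustration.
disable = refresh_interval_request("http://localhost:9200", "my_index", "-1")
restore = refresh_interval_request("http://localhost:9200", "my_index", "1s")

# To actually apply a setting against a live cluster:
#   urllib.request.urlopen(disable)
```

You would send `disable` before the bulk load and `restore` once it finishes, so that searches see new documents again.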

There is also a way to optimize / force merge afterwards, but in practice that should really be used only when you're pretty sure you're done modifying the index and it is now historical data.
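
As a sketch of what that looks like on 1.x, where the endpoint is `_optimize` (later releases renamed it to `_forcemerge`): the host, the index name `logs-2015.10`, and the helper name `optimize_request` below are placeholders of mine, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request


def optimize_request(base_url, index, max_num_segments=1):
    """Build (but do not send) a POST /<index>/_optimize request."""
    query = urllib.parse.urlencode({"max_num_segments": max_num_segments})
    return urllib.request.Request(
        url=f"{base_url}/{index}/_optimize?{query}",
        method="POST",
    )


# Hypothetical host and index name; pick an index that is no longer written to.
req = optimize_request("http://localhost:9200", "logs-2015.10")

# To actually run it against a live cluster:
#   urllib.request.urlopen(req)
```

Merging down to a single segment is expensive on a large index, which is another reason to keep it for indices that have stopped changing.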

Hope that helps

Setting the refresh interval won't always prevent refreshes. If you run out
of the memory budgeted for holding documents before a refresh, Elasticsearch
will trigger a refresh anyway.

Refreshes aren't merges either, though you are right in that not refreshing
won't create the segments which have to later be merged.

You can control merges somewhat with merge throttling and the size of the
merge thread pool, but you can't really predict or prevent them. At least,
not in any sustainable, practical way.
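
To make the throttling part concrete, here is a rough sketch of the settings I have in mind for 1.x: `indices.store.throttle.*` at the cluster level and `index.merge.scheduler.max_thread_count` per index. I'm assuming both can be updated dynamically on your version, so check the docs for your release. The host, index name, values, and the helper name `settings_request` are illustrative only, and nothing is sent.

```python
import json
import urllib.request


def settings_request(url, settings):
    """Build (but do not send) a PUT request carrying a JSON settings body."""
    return urllib.request.Request(
        url=url,
        data=json.dumps(settings).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


BASE = "http://localhost:9200"  # hypothetical host

# Cluster-wide cap on merge I/O; "40mb" is an illustrative value, not a recommendation.
throttle = settings_request(
    f"{BASE}/_cluster/settings",
    {
        "transient": {
            "indices.store.throttle.type": "merge",
            "indices.store.throttle.max_bytes_per_sec": "40mb",
        }
    },
)

# Per-index cap on concurrent merge threads; 1 is an illustrative value.
threads = settings_request(
    f"{BASE}/my_index/_settings",
    {"index.merge.scheduler.max_thread_count": 1},
)

# To actually apply them against a live cluster:
#   urllib.request.urlopen(throttle)
#   urllib.request.urlopen(threads)
```

These only change how fast and how concurrently merges run. They don't tell you when one will start.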
