In this discussion (Slow Bulk Insert - #8 by kayngee) I found the following advice:
As for the merge policy, tuning it for more segments will trade some search performance for indexing performance. But increasing the floor_segment size is going to create more concurrent merging, especially coupled with higher max_merge_at_once* settings. So I'd only increase segments_per_tier.
This conflicts with my understanding of floor_segment (Merge | Elasticsearch Guide [1.6] | Elastic):
index.merge.policy.floor_segment:
Segments smaller than this are "rounded up" to this size, i.e. treated as
equal (floor) size for merge selection. This is to prevent frequent
flushing of tiny segments, thus preventing a long tail in the index. Default
is 2mb.
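To make my reading of "rounded up" concrete, here is a minimal sketch of how I picture it. It assumes the rounding amounts to taking the larger of the real segment size and the floor. The name `floored_size` and the sample sizes are my own, purely for illustration, and are not an Elasticsearch or Lucene API.

```python
MB = 1024 * 1024
KB = 1024


def floored_size(segment_bytes: int, floor_segment_bytes: int = 2 * MB) -> int:
    """Size used for merge selection under my assumption:
    segments below the floor are treated as exactly floor-sized."""
    return max(segment_bytes, floor_segment_bytes)


# Hypothetical segment sizes: two tiny flushed segments and two larger ones.
segment_sizes = [100 * KB, 500 * KB, 3 * MB, 40 * MB]

for floor in (2 * MB, 10 * MB):
    effective = [floored_size(size, floor) for size in segment_sizes]
    # Print the sizes merge selection would "see" for this floor, in MB.
    print(floor // MB, [size / MB for size in effective])
```

With the default 2mb floor, only the two tiny segments are treated as equal in size. With a 10mb floor, the 3mb segment joins them, so three of the four segments look identical to merge selection.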
My reading of this is that a higher floor_segment would reduce the likelihood of merging, because there would be less flushing of tiny segments. The post above suggests I may have this backwards. Can anyone help clarify how floor_segment relates to the merge rate? When floor_segment increases or decreases, what is likely to happen to the merge frequency?
Thanks!