Merge Bottleneck

It seems we are running into an application bottleneck during merge operations. We are running Elasticsearch 1.5.2 with the default tiered merge policy, and we have tested several other ES versions with similar results. Per the docs, index.merge.scheduler.max_thread_count defaults to max(1, min(3, available processors / 2)), so on an 8-core machine we would expect up to three threads dedicated to merging once they are needed.

We keep active metrics on batch indexing time and notice considerable increases during merge operations. While indexing times are elevated, a dump of hot_threads shows merge operations at the top of the list on the node, and a tool such as nmon shows one core completely pegged while the other cores sit idle. To us this indicates some type of blocking. The docs note that disk IO can be a bottleneck during merges, but the measurements we take during merges show we are coming nowhere near the disks' full IO capacity.
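Going back to the thread-count default mentioned above, it works out like this (a small sketch of the documented formula; the 8-core count is just our hardware):

<code>
// Documented default for index.merge.scheduler.max_thread_count:
//   Math.max(1, Math.min(3, availableProcessors / 2))
public class MergeThreadDefault {
    public static void main(String[] args) {
        int availableProcessors = 8; // our hardware; substitute your own core count
        int defaultMaxThreadCount = Math.max(1, Math.min(3, availableProcessors / 2));
        // 8 / 2 = 4, clamped to 3, so up to three concurrent merge threads
        System.out.println("max_thread_count default: " + defaultMaxThreadCount);
    }
}
</code>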

We have had some success improving indexing bandwidth by dropping index.merge.policy.max_merged_segment from its 5gb default to 1gb. This obviously increases the number of segments and can impact query performance. Drilling down into the thread traces, we see evidence that the compression step may be what is blocking; the hot_threads excerpt appears below, after a sketch of the settings change.
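For anyone who wants to try the same tweak, this is roughly how we apply it. It is a minimal sketch assuming the 1.x Java client, with the index name as a placeholder; the same setting can also be changed through the REST update-settings API.

<code>
import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.ImmutableSettings;

public class MergeSettingsTweak {
    // Drop the maximum merged segment size from the 5gb default to 1gb on a live index.
    // "my-index" is a placeholder; pass in an already-constructed node or transport Client.
    public static void applySmallerMaxSegment(Client client) {
        client.admin().indices()
              .prepareUpdateSettings("my-index")
              .setSettings(ImmutableSettings.settingsBuilder()
                      .put("index.merge.policy.max_merged_segment", "1gb")
                      .build())
              .execute()
              .actionGet();
    }
}
</code>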

<code>
100.0% (499.8ms out of 500ms) cpu usage by thread 'elasticsearch[xxx-search101][[2015-06-10t00:00:00.000z][9]: Lucene Merge Thread #515]'
 2/10 snapshots sharing following 13 elements
   org.apache.lucene.codecs.compressing.LZ4.encodeSequence(LZ4.java:170)
   org.apache.lucene.codecs.compressing.LZ4.compress(LZ4.java:243)
   org.apache.lucene.codecs.compressing.CompressionMode$LZ4FastCompressor.compress(CompressionMode.java:161)
   org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.flush(CompressingStoredFieldsWriter.java:236)
</code>

Is it possible that the compression component of the merge is what is blocking? Watching CPU and disk resources closely during indexing operations shows we are not close to saturation on either. Thoughts or suggestions would be appreciated.

This likely means only one large merge was necessary at the time; execution of a single merge is single-threaded.

If, while that merge was still running, another large merge became necessary, then you would see a 2nd merge thread running concurrently.

Those LZ4 compression APIs are thread-private, so one merge doing compression would not block another merge.
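If it helps to picture it, here is a toy sketch (not the actual Lucene/ES code, just an illustration of the model described above): each in-flight merge runs on its own thread with its own compression state, so there is no shared lock for one merge to wait on.

<code>
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MergeThreadingSketch {
    // Toy stand-in for per-merge compression state; Lucene's LZ4 hash tables are
    // similarly private to the writer performing that merge.
    static class PrivateCompressorState {
        final int[] hashTable = new int[1 << 14];
    }

    public static void main(String[] args) {
        // Up to three merges may run at once (the default on an 8-core box),
        // but each individual merge is executed by a single thread.
        ExecutorService mergePool = Executors.newFixedThreadPool(3);
        for (int i = 0; i < 2; i++) {
            final int mergeId = i;
            mergePool.submit(() -> {
                // Each merge thread builds its own state; nothing here is shared,
                // so one merge compressing cannot block another.
                PrivateCompressorState state = new PrivateCompressorState();
                System.out.println("merge " + mergeId + " using a " + state.hashTable.length
                        + "-entry table on " + Thread.currentThread().getName());
            });
        }
        mergePool.shutdown();
    }
}
</code>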

Mike