I'm performing bulk indexing in a scheduled manner and right before each one I disable refresh_interval to avoid throttling.
It used to work perfectly but then I upgraded to 1.6.0 (not 100% sure it caused it) and performance decreased from 4000 op/s to 250 op/s until Merge Thread finishes and then it returns back to 4000 op/s.
So my question is why I see Segment Merge Thread in _nodes/hot_threads keeping CPU busy even if I disabled refresh_interval for the index? It starts merging right after indexing and stops after 30 minutes if the bulk indexing doesn't stop sooner. Thanks !
I have 2 nodes (index has 5 shards) with one replica. Bulk sizes are optimal, index is not full, enough heap/off-heap RAM
24.6% (122.8ms out of 500ms) cpu usage by thread 'elasticsearch[Blacklash][[mi_2015_08_w4][0]: Lucene Merge Thread #86911]'
2/10 snapshots sharing following 19 elements
java.lang.Thread.sleep(Native Method)
java.lang.Thread.sleep(Thread.java:340)
org.apache.lucene.store.RateLimiter$SimpleRateLimiter.pause(RateLimiter.java:151)
org.apache.lucene.store.RateLimitedFSDirectory$RateLimiterWrapper.pause(RateLimitedFSDirectory.java:96)
org.apache.lucene.store.RateLimitedIndexOutput.checkRate(RateLimitedIndexOutput.java:76)
org.apache.lucene.store.RateLimitedIndexOutput.writeBytes(RateLimitedIndexOutput.java:70)
org.apache.lucene.codecs.lucene410.Lucene410DocValuesConsumer.addBinaryField(Lucene410DocValuesConsumer.java:284)
org.apache.lucene.codecs.lucene410.Lucene410DocValuesConsumer.addTermsDict(Lucene410DocValuesConsumer.java:336)
org.apache.lucene.codecs.lucene410.Lucene410DocValuesConsumer.addSortedField(Lucene410DocValuesConsumer.java:471)
org.apache.lucene.codecs.lucene410.Lucene410DocValuesConsumer.addSortedSetField(Lucene410DocValuesConsumer.java:500)
org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addSortedSetField(PerFieldDocValuesFormat.java:127)
org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:626)
org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:213)
org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:122)
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4223)
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3811)
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:409)
org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)
2/10 snapshots sharing following 11 elements
org.apache.lucene.store.OutputStreamIndexOutput.writeBytes(OutputStreamIndexOutput.java:51)
org.apache.lucene.store.RateLimitedIndexOutput.writeBytes(RateLimitedIndexOutput.java:71)
org.apache.lucene.store.CompoundFileWriter$DirectCFSIndexOutput.writeBytes(CompoundFileWriter.java:356)
org.apache.lucene.store.DataOutput.copyBytes(DataOutput.java:281)
org.apache.lucene.store.Directory.copy(Directory.java:194)
org.apache.lucene.index.IndexWriter.createCompoundFile(IndexWriter.java:4785)
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4266)
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3811)
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:409)
org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)
Now I'm even disabling throttling entirely before indexing :
"indices.store.throttle.type" : "none"
But I still see this in _nodes/hot_threads for the node. So neither disabling refresh_interval nor disabling throttling prevents ES from merging segments :-/
49.6% (248ms out of 500ms) cpu usage by thread 'elasticsearch[Blacklash][[mi_2015_08_w4][4]: Lucene Merge Thread #88922]'
10/10 snapshots sharing following 4 elements
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3811)
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:409)
org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)
Indeed Elasticsearch needs to merge segments regardless of whether you refresh or not. When you add data to a Lucene index, it needs to invert the data (to build the inverted index) in memory. This in-memory buffer is merged to a segment not only when there is a refresh happening but also when this buffer hits a maximum size. And when there are many segments, elasticsearch/lucene need to merge them.
Actually, Elasticsearch also needs the indexing buffer for doc values, just for different reasons. It's actually "normal" that a node spends CPU time merging if there is steady indexing.
Earlier versions of elasticsearch had completely asynchronous indexing and refresh, which used to cause issues because the number of segments could grow unbounded so this was fixed in order to make sure that indexing keeps up with merges, I'm wondering that this might be the cause of the issue you are seeing. Which version did you upgrade from and how many fields do you have?
We used 1.5.0 before. The document contains ~ 20 fields. 25 characters on average, one of them has 800 chars. I was playing with bulk sizes from 90 to 500 but it doesn't seem to have effect on this issue.
Now I am using ES 2.1. Still have the same problem. Indexing speed will drop down to hundreds per seconds and then back. But the strange thing is that it happens from one of the day, before it , I didn't observe anything like this.
We are using 1.7 and noticed a similar problem: The indexing getting throttled even when the IO was not saturated (Fusion IO in our case).
After increasing "index.merge.scheduler.max_thread_count" the problem seems to be gone -- we did not see any more throttling being logged so far.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.