I'm seeing heavy CPU usage and I/O after indexing. Long after indexing has finished, the load average is still high and CPU usage stays at 50-100%, which makes search operations slow.
I've set index.merge.scheduler.max_thread_count to 1, but I still see many threads per node (GET _nodes/hot_threads):
100.2% (500.7ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][9]: Lucene Merge Thread #49]'
3/10 snapshots sharing following 14 elements
org.apache.lucene.codecs.DocValuesConsumer$3$1.hasNext(DocValuesConsumer.java:316)
org.apache.lucene.codecs.DocValuesConsumer$10$1.hasNext(DocValuesConsumer.java:855)
...
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)
64.7% (323.3ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][5]: Lucene Merge Thread #35]'
10/10 snapshots sharing following 9 elements
org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addSortedNumericField(PerFieldDocValuesFormat.java:122)
org.apache.lucene.codecs.DocValuesConsumer.mergeSortedNumericField(DocValuesConsumer.java:301)
org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:223)
org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:122)
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4223)
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3811)
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:409)
org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)
52.6% (262.9ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][1]: Lucene Merge Thread #46]'
2/10 snapshots sharing following 17 elements
org.apache.lucene.index.SingletonSortedNumericDocValues.setDocument(SingletonSortedNumericDocValues.java:52)
org.apache.lucene.codecs.DocValuesConsumer$3$1.setNext(DocValuesConsumer.java:353)
org.apache.lucene.codecs.DocValuesConsumer$3$1.hasNext(DocValuesConsumer.java:316)
...
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)
99.1% (495.6ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][0]: Lucene Merge Thread #47]'
2/10 snapshots sharing following 13 elements
org.apache.lucene.codecs.DocValuesConsumer$3$1.setNext(DocValuesConsumer.java:359)
org.apache.lucene.codecs.DocValuesConsumer$3$1.hasNext(DocValuesConsumer.java:316)
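To check whether the limit is being exceeded for any single shard, the merge threads in the output above can be grouped by the index and shard embedded in each thread name. The following is only a rough sketch: the function name `merge_threads_per_shard` and the regular expression are my own, and it assumes thread names always follow the `[[index][shard]: Lucene Merge Thread #N]` pattern seen above.

```python
import re
from collections import Counter

# Matches the tail of thread names such as
# 'elasticsearch[node][[testindex][9]: Lucene Merge Thread #49]'
# (pattern assumed from the hot_threads output above).
MERGE_THREAD = re.compile(
    r"\[\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]: Lucene Merge Thread #(?P<num>\d+)\]"
)


def merge_threads_per_shard(hot_threads_text):
    """Count distinct merge threads per (index, shard) in hot_threads text."""
    seen = set()
    for m in MERGE_THREAD.finditer(hot_threads_text):
        seen.add((m["index"], int(m["shard"]), int(m["num"])))
    return Counter((index, shard) for index, shard, _ in seen)


# Header lines copied from the hot_threads output above.
sample = """
100.2% (500.7ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][9]: Lucene Merge Thread #49]'
64.7% (323.3ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][5]: Lucene Merge Thread #35]'
52.6% (262.9ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][1]: Lucene Merge Thread #46]'
99.1% (495.6ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][0]: Lucene Merge Thread #47]'
"""

counts = merge_threads_per_shard(sample)
for (index, shard), n in sorted(counts.items()):
    print(f"{index}[{shard}]: {n} merge thread(s)")
```

In this sample, each of the four merge threads belongs to a different shard (0, 1, 5 and 9) of testindex, so no single shard has more than one merge thread running.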
The index section of my configuration looks like this:
index:
  auto_expand_replicas: 0-all
  merge:
    scheduler:
      max_thread_count: 1
  ...
indices:
  store:
    throttle:
      max_bytes_per_sec: 10mb
      type: merge
There are also messages like this in the log:
now throttling indexing: numMergesInFlight=4, maxNumMerges=3
By the way, the merge problems started after we moved some of the data to doc_values.
Do I misunderstand the setting? I expected it to ensure that only one merge thread is running at a time.
How does it differ from index.merge.policy.max_merge_at_once? Should I set that instead?
What I want is to make merging less aggressive.