Throttle merges


(Sergey Novikov) #1

I'm having issues with heavy CPU usage and I/O after indexing. Long after indexing has finished, the load average is still high and the CPU is loaded (50-100%), which makes search operations slow.

I've set index.merge.scheduler.max_thread_count to 1, but I still see many merge threads per node (GET _nodes/hot_threads):

100.2% (500.7ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][9]: Lucene Merge Thread #49]'
 3/10 snapshots sharing following 14 elements
   org.apache.lucene.codecs.DocValuesConsumer$3$1.hasNext(DocValuesConsumer.java:316)
   org.apache.lucene.codecs.DocValuesConsumer$10$1.hasNext(DocValuesConsumer.java:855)
   ...
   org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)

64.7% (323.3ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][5]: Lucene Merge Thread #35]'
 10/10 snapshots sharing following 9 elements
   org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addSortedNumericField(PerFieldDocValuesFormat.java:122)
   org.apache.lucene.codecs.DocValuesConsumer.mergeSortedNumericField(DocValuesConsumer.java:301)
   org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:223)
   org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:122)
   org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4223)
   org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3811)
   org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:409)
   org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
   org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)

52.6% (262.9ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][1]: Lucene Merge Thread #46]'
 2/10 snapshots sharing following 17 elements
   org.apache.lucene.index.SingletonSortedNumericDocValues.setDocument(SingletonSortedNumericDocValues.java:52)
   org.apache.lucene.codecs.DocValuesConsumer$3$1.setNext(DocValuesConsumer.java:353)
   org.apache.lucene.codecs.DocValuesConsumer$3$1.hasNext(DocValuesConsumer.java:316)
   ...
   org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)

99.1% (495.6ms out of 500ms) cpu usage by thread 'elasticsearch[testing-195146d5][[testindex][0]: Lucene Merge Thread #47]'
 2/10 snapshots sharing following 13 elements
   org.apache.lucene.codecs.DocValuesConsumer$3$1.setNext(DocValuesConsumer.java:359)
   org.apache.lucene.codecs.DocValuesConsumer$3$1.hasNext(DocValuesConsumer.java:316)
   ...

The index section of my configuration looks like this:

index:
  auto_expand_replicas: 0-all
  merge:
    scheduler:
      max_thread_count: 1
...
indices:
  store:
    throttle:
      max_bytes_per_sec: 10mb
      type: merge
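
For what it's worth, the throttle part can also be changed at runtime through the cluster settings API instead of the config file; roughly like this (a sketch, assuming a 1.x cluster reachable on localhost):

# dynamically apply the same merge throttle (transient: lost on full cluster restart)
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "indices.store.throttle.type": "merge",
    "indices.store.throttle.max_bytes_per_sec": "10mb"
  }
}'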

There are messages like this in the log:

now throttling indexing: numMergesInFlight=4, maxNumMerges=3
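
That message suggests maxNumMerges is derived from max_thread_count (here apparently 1 + 2 = 3), and indexing is paused whenever more merges than that are in flight. To watch what the merges are actually doing while this happens, something like this works (using the index name from the hot_threads output):

# per-index merge stats: current merges, total merge time, throttled time
curl 'localhost:9200/testindex/_stats/merge?pretty'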

By the way, the merge problems started after we moved some of the data to doc_values.

Do I misunderstand the setting? It looks like it should ensure that only one merge thread is running.
How does it differ from index.merge.policy.max_merge_at_once, and should I set that instead?

What I want is to make merging less aggressive.
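
If merge policy tuning is the way to go, this is a sketch of what I have in mind (untested, and I'm assuming these TieredMergePolicy settings can be updated live on 1.x):

# let more segments accumulate per tier, and merge fewer segments at a time
curl -XPUT 'localhost:9200/testindex/_settings' -d '{
  "index.merge.policy.max_merge_at_once": 5,
  "index.merge.policy.segments_per_tier": 20
}'

Raising segments_per_tier lets more segments accumulate before a merge is triggered (less merging, at the cost of slightly slower searches), and lowering max_merge_at_once makes each individual merge smaller.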


(Srinath C) #2

See the discussion in the "Index throttling issue" topic.
We faced a similar issue after switching to doc_values, and we resolved it by removing the merge policy and index throttle settings.
We are still seeing increased disk utilization, though.
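
Concretely, "removing" them for us meant resetting the store throttle, roughly like this (a sketch against the 1.x cluster settings API):

# disable store-level I/O throttling entirely
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "indices.store.throttle.type": "none"
  }
}'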


(Sergey Novikov) #3

Hm, but that seems to be exactly the opposite of what I'm trying to achieve? I.e. I need to throttle merges; otherwise they take all the resources and make every other operation slow.

(With doc_values, disk utilisation has increased for us as well, but there are no heap problems anymore.)


(Mark Walkom) #4

There's always a cost, so you just need to balance the trade-offs accordingly.


(Sergey Novikov) #5

After some experiments, it turns out that index.merge.scheduler.max_thread_count = 1 doesn't mean one merge thread (more like 4 in my case), but it still reduces load. I ran 50000 partial updates in 50 threads.
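
The load was generated roughly like this (a sketch; the doc type and the counter field are made up for illustration):

# 50000 partial updates, 50 in parallel ("counter" is a hypothetical field)
seq 1 50000 | xargs -P 50 -I '{}' \
  curl -s -XPOST 'localhost:9200/testindex/doc/{}/_update' \
       -d '{"doc":{"counter":1},"doc_as_upsert":true}' > /dev/null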

without throttling:

[screenshot: load graph without throttling]

and with max_thread_count = 1:

[screenshot: load graph with max_thread_count = 1]