Here is my strange story:
We had a cluster of 5 nodes, with one node under heavy load.
We added 2 new nodes. After a few days we removed 2 old ones because they had too much load, and we finally added 2 new, more powerful machines. In the end there are 7 nodes in the cluster.
Since all these changes, every node is reporting a Critical load alert (in our Centreon monitoring). It's quite inexplicable!
We expected more power, and therefore less load on all our machines, but it's the opposite.
Any idea? Should we launch a refresh of some kind?
Note that:
All shards are properly replicated; no more shards, no more replicas, nothing changed. Just 2 more powerful nodes...
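For context, here is roughly how I pulled the numbers and the hot threads dump below (a minimal sketch, assuming one of the nodes answers HTTP on localhost:9200; the host and the script itself are illustrative, not our real tooling):

```python
# Minimal sketch: compare shard balance, node load and hot threads across the 7 nodes.
# Assumes a node is reachable on localhost:9200 (replace with a real node address).
import requests

ES = "http://localhost:9200"  # hypothetical endpoint

# Shards and disk usage per node, to confirm data is spread over all 7 nodes
print(requests.get(ES + "/_cat/allocation?v").text)

# Load / heap / RAM per node as reported by Elasticsearch itself
print(requests.get(ES + "/_cat/nodes?v").text)

# The hot threads dump pasted below comes from this endpoint
print(requests.get(ES + "/_nodes/hot_threads").text)
```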
::: {es-prod-node-prod2}{ZW_3xChXRM2yB5ZS9KTWMQ}{10.91.6.66}{10.91.6.66:9300}{master=false}
Hot threads at 2017-10-06T06:53:13.120Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
73.9% (369.3ms out of 500ms) cpu usage by thread 'elasticsearch[es-prod-node-prod2][[bypath-index-external-contacts-201709300100][16]: Lucene Merge Thread #4]'
3/10 snapshots sharing following 24 elements
sun.nio.ch.NativeThread.current(Native Method)
sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:727)
sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:716)
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:179)
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
...
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3666)
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
3/10 snapshots sharing following 16 elements
org.apache.lucene.codecs.lucene54.Lucene54DocValuesProducer$2.get(Lucene54DocValuesProducer.java:502)
org.apache.lucene.codecs.lucene54.Lucene54DocValuesProducer$8.valueAt(Lucene54DocValuesProducer.java:869)
org.apache.lucene.codecs.DocValuesConsumer$4$1.setNext(DocValuesConsumer.java:522)
...
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4086)
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3666)
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
2/10 snapshots sharing following 15 elements
org.apache.lucene.store.DataOutput.writeBytes(DataOutput.java:52)
org.apache.lucene.util.packed.DirectWriter.flush(DirectWriter.java:86)
org.apache.lucene.util.packed.DirectWriter.add(DirectWriter.java:78)
...
org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:105)
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4086)
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3666)
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
2/10 snapshots sharing following 13 elements
org.apache.lucene.codecs.DocValuesConsumer$4$1.hasNext(DocValuesConsumer.java:497)
org.apache.lucene.codecs.lucene54.Lucene54DocValuesConsumer.addNumericField(Lucene54DocValuesConsumer.java:243)
...
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3666)
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
9.6% (48.1ms out of 500ms) cpu usage by thread 'elasticsearch[es-prod-node-prod2][search][T#8]'
10/10 snapshots sharing following 2 elements
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
8.2% (40.9ms out of 500ms) cpu usage by thread 'elasticsearch[es-prod-node-prod2][search][T#7]'
10/10 snapshots sharing following 2 elements
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
::: {es-prod-node-prod5}{CnqYBPyiSvKjK04iMFTPBA}{10.91.157.202}{10.91.157.202:9300}{master=false}
Hot threads at 2017-10-06T07:04:21.034Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
66.4% (331.9ms out of 500ms) cpu usage by thread 'elasticsearch[es-prod-node-prod5][search][T#5]'
2/10 snapshots sharing following 2 elements
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
13.7% (68.6ms out of 500ms) cpu usage by thread 'elasticsearch[es-prod-node-prod5][search][T#10]'
2/10 snapshots sharing following 2 elements
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
33.5% (167.6ms out of 500ms) cpu usage by thread 'elasticsearch[es-prod-node-prod3][[bypath-index-external-contacts-201709300100][17]: Lucene Merge Thread #168]'
4/10 snapshots sharing following 21 elements
::: {es-prod-node-prod8}{8Rb--vEmT0W4iGBqX53rYw}{10.91.145.3}{10.91.145.3:9300}{master=false}
Hot threads at 2017-10-06T07:04:21.033Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
81.9% (409.6ms out of 500ms) cpu usage by thread 'elasticsearch[es-prod-node-prod8][bulk][T#6]'
3/10 snapshots sharing following 30 elements
java.util.zip.Deflater.deflateBytes(Native Method)
java.util.zip.Deflater.deflate(Deflater.java:432)