CPU utilization constantly crossing 98%

Hi Team,

Please suggest. We are observing high CPU utilization on one node of the cluster, so I queried that node's hot threads. Below is the response received; please help.

Because of the high CPU, search is slowing down; queries are taking a minimum of 3-4 minutes.
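
For reference, the output below is what the nodes hot threads API returns; from the shell it can be reproduced with something like this (the host and port are assumptions, adjust them to your node):

# sample the busiest threads on all nodes
curl -s "http://localhost:9200/_nodes/hot_threads"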

 Hot threads at 2019-02-18T10:10:58.536Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   
   140.6% (703.1ms out of 500ms) cpu usage by thread 'elasticsearch[Prod-Node-Three-25][search][T#1]'
     5/10 snapshots sharing following 15 elements
       org.apache.lucene.codecs.blocktree.IntersectTermsEnum.next(IntersectTermsEnum.java:454)
       org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:114)
       org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:157)
       org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)
       org.elasticsearch.search.fetch.matchedqueries.MatchedQueriesFetchSubPhase.addMatchedQueries(MatchedQueriesFetchSubPhase.java:123)
       org.elasticsearch.search.fetch.matchedqueries.MatchedQueriesFetchSubPhase.hitExecute(MatchedQueriesFetchSubPhase.java:80)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:194)
       org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:516)
       org.elasticsearch.search.action.SearchServiceTransportAction$FetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:868)
       org.elasticsearch.search.action.SearchServiceTransportAction$FetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:862)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
       java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
       java.lang.Thread.run(Unknown Source)
     4/10 snapshots sharing following 17 elements
       org.apache.lucene.util.fst.FST.findTargetArc(FST.java:1195)
       org.apache.lucene.codecs.blocktree.IntersectTermsEnum.pushFrame(IntersectTermsEnum.java:174)
     unique snapshot
       org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.decodeTerm(Lucene41PostingsReader.java:206)
       org.apache.lucene.codecs.blocktree.IntersectTermsEnumFrame.decodeMetaData(IntersectTermsEnumFrame.java:289)
       org.apache.lucene.codecs.blocktree.IntersectTermsEnum.docFreq(IntersectTermsEnum.java:195)
       org.apache.lucene.search.ConstantScoreAutoRewrite$CutOffTermCollector.collect(ConstantScoreAutoRewrite.java:132)
   
   100.0% (500ms out of 500ms) cpu usage by thread 'elasticsearch[Prod-Node-Three-25][search][T#6]'
     2/10 snapshots sharing following 18 elements
       org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:79)
       org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
       org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
     5/10 snapshots sharing following 15 elements
       org.apache.lucene.codecs.blocktree.IntersectTermsEnum.next(IntersectTermsEnum.java:454)
       org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:114)
       org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:157)
     2/10 snapshots sharing following 17 elements
       org.apache.lucene.util.fst.FST.findTargetArc(FST.java:1195)
       org.apache.lucene.codecs.blocktree.IntersectTermsEnum.pushFrame(IntersectTermsEnum.java:174)
       org.apache.lucene.codecs.blocktree.IntersectTermsEnum.next(IntersectTermsEnum.java:444)
       org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:114)
     unique snapshot
       org.apache.lucene.store.DataInput.readVInt(DataInput.java:122)
       org.apache.lucene.util.fst.ByteSequenceOutputs.read(ByteSequenceOutputs.java:125)
       org.apache.lucene.util.fst.ByteSequenceOutputs.read(ByteSequenceOutputs.java:35)
       org.apache.lucene.util.fst.Outputs.readFinalOutput(Outputs.java:77)
   
   87.5% (437.5ms out of 500ms) cpu usage by thread 'elasticsearch[Prod-Node-Three-25][search][T#7]'
     5/10 snapshots sharing following 15 elements
       org.apache.lucene.codecs.blocktree.IntersectTermsEnum.next(IntersectTermsEnum.java:454)
       org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:114)
       org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:157)
   
     5/10 snapshots sharing following 15 elements
       org.apache.lucene.codecs.blocktree.IntersectTermsEnum.next(IntersectTermsEnum.java:444)
       org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:114)
       org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:157)
       org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)

Hi,

In order for the community to help you:

  • what version of Elasticsearch are you using?
  • what OS are you using?
  • how many indices / shards / replicas do you have?
  • how much memory does your system have?
  • how much memory is allocated to Elasticsearch's JVM?
  • how many cores are actually allocated to Elasticsearch?
  • what is your use case, and what kind of query do you use most (aggregations, "regular" searches, "scrolling" searches, ...)?

Alternatively, you can run the following command to get pretty much anything we need to know:

GET /_cluster/stats?human&pretty
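
If you do not have a console such as Sense / Kibana handy, the same call can be made from the shell with curl (host and port are assumptions, adjust to your setup):

# cluster-wide stats: versions, node counts, memory, indices, shards, ...
curl -s "http://localhost:9200/_cluster/stats?human&pretty"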

Furthermore, can you run the following commands:

GET /_cat/pending_tasks?v
GET _tasks
GET /_cat/thread_pool?v

What they do is quite simple:

  • the first command gives an overview of pending cluster tasks across the whole cluster. Its documentation can be found here (or there).
  • the second one gives you the opportunity to get more details on pending tasks initiated by both the user and the cluster. Its documentation can be found here.
  • the last one returns the state of various queues used by Elasticsearch internally, which can be quite useful at times. Its documentation can be found here.
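
As above, if you are running these from the shell rather than a console, the equivalent curl calls would be roughly (same host/port assumption):

# pending cluster-level tasks
curl -s "http://localhost:9200/_cat/pending_tasks?v"

# currently running tasks, user- and cluster-initiated
curl -s "http://localhost:9200/_tasks?pretty"

# state of the internal thread pools and their queues
curl -s "http://localhost:9200/_cat/thread_pool?v"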

If you have access to Elastic support, you can also run the support-diagnostics tool so that the support team can help you work through this issue.

Best regards,

Charles.w
