CPU at 99% - Need help understanding hot_threads

Camilo_Sierra · February 18, 2016, 5:42pm

85.2% (426.1ms out of 500ms) cpu usage by thread 'elasticsearch[Data 1-hot][bulk][T#1]'
3/10 snapshots sharing following 19 elements
org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:401)
org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:706)
org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:497)
org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:554)
org.elasticsearch.index.mapper.object.ObjectMapper.serializeNonDynamicArray(ObjectMapper.java:685)
org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:604)
org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:489)
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:493)
org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:409)
org.elasticsearch.action.bulk.TransportShardBulkAction.shardUpdateOperation(TransportShardBulkAction.java:515)
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:232)
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
2/10 snapshots sharing following 18 elements
org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:706)
org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:497)
org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:554)
org.elasticsearch.index.mapper.object.ObjectMapper.serializeNonDynamicArray(ObjectMapper.java:685)
org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:604)
org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:489)
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:493)

Camilo_Sierra · February 18, 2016, 5:42pm

   org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:409)
   org.elasticsearch.action.bulk.TransportShardBulkAction.shardUpdateOperation(TransportShardBulkAction.java:515)
   org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:232)
   org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
   org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
   org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
   java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   java.lang.Thread.run(Thread.java:745)
 5/10 snapshots sharing following 11 elements
   org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
   org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:493)
   org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:409)
   org.elasticsearch.action.bulk.TransportShardBulkAction.shardUpdateOperation(TransportShardBulkAction.java:515)
   org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:232)
   org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
   org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
   org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
   java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   java.lang.Thread.run(Thread.java:745)
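
One way to read a dump like the one above is to reduce it to the CPU share per thread and the top frame of each snapshot group. The following is a minimal Python sketch, not part of Elasticsearch. The function name `summarize_hot_threads` and the regular expressions are assumptions based only on the line shapes visible in this post.

```python
import re

# Matches e.g. "85.2% (426.1ms out of 500ms) cpu usage by thread '...'"
HEADER_RE = re.compile(
    r"(?P<pct>\d+(?:\.\d+)?)% \((?P<busy>[\d.]+)ms out of (?P<interval>[\d.]+)ms\) "
    r"cpu usage by thread '(?P<thread>[^']+)'"
)
# Matches e.g. "3/10 snapshots sharing following 19 elements"
GROUP_RE = re.compile(
    r"(?P<hits>\d+)/(?P<total>\d+) snapshots sharing following (?P<n>\d+) elements"
)


def summarize_hot_threads(text):
    """Return one dict per thread with its CPU percentage and snapshot groups.

    Hypothetical helper: it only records the first (innermost) frame of each
    group, which is usually where the time is being spent.
    """
    threads = []
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.search(line)
        if header:
            threads.append({
                "thread": header.group("thread"),
                "cpu_percent": float(header.group("pct")),
                "groups": [],
            })
            continue
        group = GROUP_RE.search(line)
        if group and threads:
            threads[-1]["groups"].append({
                "hits": int(group.group("hits")),
                "total": int(group.group("total")),
                "top_frame": None,
            })
            continue
        # The first non-empty line after a group header is its top frame.
        if line and threads and threads[-1]["groups"]:
            current = threads[-1]["groups"][-1]
            if current["top_frame"] is None:
                current["top_frame"] = line
    return threads


# Shortened excerpt of the dump posted in this thread.
SAMPLE = """\
85.2% (426.1ms out of 500ms) cpu usage by thread 'elasticsearch[Data 1-hot][bulk][T#1]'
3/10 snapshots sharing following 19 elements
org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:401)
org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:706)
5/10 snapshots sharing following 11 elements
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
"""

for thread in summarize_hot_threads(SAMPLE):
    print(thread["thread"], thread["cpu_percent"])
    for group in thread["groups"]:
        print("  ", group["hits"], "of", group["total"], "->", group["top_frame"])
```

In the dump in this thread, every snapshot group sits under `DocumentMapper.parse` on a `[bulk]` thread, which suggests the CPU time is going into parsing documents during bulk indexing.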

Jeferson_Martins · February 18, 2016, 5:47pm

How many shards are in your cluster?
Are you running a snapshot at this time?

Jeferson_Martins · February 18, 2016, 5:47pm

And how much free disk space do you have?

Camilo_Sierra · February 18, 2016, 5:55pm

We have 433 shards in the cluster, but only 3 shards in the index that is causing trouble. These 3 shards are spread across 3 nodes. This is the only index that is receiving documents.

We are not running any snapshots in this cluster.

Disk usage is at 26%.

Jeferson_Martins · February 18, 2016, 7:23pm

Are the shards in the same index?

Are there replicas for this index?

Have you changed the bulk queue or size?

Camilo_Sierra · February 19, 2016, 2:09pm

Yes, same index. At the beginning it had replicas, but I deleted them when the problem started.
I have not changed the size or the queue. For the first 18 days it worked, and yesterday the traffic was normal but the CPU skyrocketed.

Jeferson_Martins · February 19, 2016, 4:06pm

Maybe the Elasticsearch hosts cannot support the size of your cluster.

How much RAM and CPU are you using?

In my case, I had a cluster of 10 hosts with 7 GB of RAM each. I had an index with more than 100 GB of data, and my cluster went down every day. I decided to use another kind of host in AWS, with more RAM and CPU, and to keep fewer indices in the cluster at the same time.

Have you considered adding more CPU and RAM?
Have you checked bulk.queue and bulk.size?

/_cat/thread_pool?v&h=name,host,bulk.active,bulk.rejected,bulk.completed,bulk.queue,bulk.queueSize
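
The output of that request is a whitespace-separated table with a header row (because of `?v`). As a rough sketch, it can be turned into rows and checked for rejected or queued bulk requests like this. The helper names `parse_cat_table` and `nodes_with_bulk_pressure` are hypothetical, and the sample values are made up for illustration, not taken from this cluster. The sketch also assumes no column value contains a space; a node name such as "Data 1-hot" would break the split, so the sample leaves the `name` column out.

```python
def parse_cat_table(text):
    """Parse _cat-style output (header row plus data rows) into dicts.

    Assumption: columns are separated by whitespace and no value
    contains a space.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def nodes_with_bulk_pressure(rows):
    """Return hosts that have rejected or currently queued bulk requests."""
    flagged = []
    for row in rows:
        rejected = int(row.get("bulk.rejected", 0))
        queued = int(row.get("bulk.queue", 0))
        if rejected > 0 or queued > 0:
            flagged.append(row["host"])
    return flagged


# Hypothetical output; the numbers are invented for this example.
SAMPLE_OUTPUT = """\
host    bulk.active bulk.rejected bulk.completed bulk.queue bulk.queueSize
node-a  8           120           904512         50         50
node-b  0           0             873201         0          50
"""

rows = parse_cat_table(SAMPLE_OUTPUT)
print(nodes_with_bulk_pressure(rows))
```

A non-zero `bulk.rejected` count means the node had to turn bulk requests away because its bulk queue was full, which is worth ruling out before adding hardware.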
