1 of 10 nodes CPU bound

We have a cluster of 10 machines running 10 data nodes and 3 master nodes, with 10 replicated shards.
Sometimes, when we run many percolations and searches, CPU usage rises on all nodes, but not evenly: one node goes to 100% CPU while all the others stay at 50-80%. This makes most searches really slow (~20-25 s).

  • Can I determine why this node takes 100% CPU time but not the others?
  • Can I do something to distribute the load more evenly?

Do you use custom routing of docs? That may account for the unevenness.
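
For context: custom routing sends every document with the same routing value to the same shard, so a hot routing key can concentrate traffic on one node. A minimal sketch of indexing with a routing value; the host, index name, and routing key are made-up examples, and the /_doc path assumes a recent Elasticsearch:

    # Illustrative only: index a document with an explicit routing value.
    # Documents sharing the same routing value always land on the same shard.
    import requests

    requests.put(
        "http://localhost:9200/my_index/_doc/1",  # hypothetical index and doc id
        params={"routing": "user_42"},            # routing key chosen by the application
        json={"message": "example document"},
    )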

The percolations stand out as a possible culprit; if you have a particularly nasty percolator query, that might be the cause of the imbalance.
The best way to start is to use the hot threads API to see what's going on while a node is under CPU pressure:
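
A minimal sketch of calling it, assuming a node reachable on localhost:9200 (any node will do, since the API reports on every node in the cluster):

    # Fetch the hot threads report while the cluster is under load.
    import requests

    resp = requests.get("http://localhost:9200/_nodes/hot_threads")
    print(resp.text)  # plain-text stack snapshots of the busiest threads on each node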

No.

It fluctuates a lot. Most of the time it shows only 3 hot threads (on hardware with 4 hyper-threaded CPUs, with ES taking 800% of CPU time).
Shouldn't it show me 8 hot threads?

OK, I read the doc...

threads: number of hot threads to provide, defaults to 3.
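
So raising the threads parameter should cover all 8 hardware threads; a minimal sketch, again assuming a node on localhost:9200:

    # Request up to 8 hot threads per node instead of the default 3.
    import requests

    resp = requests.get(
        "http://localhost:9200/_nodes/hot_threads",
        params={"threads": 8},
    )
    print(resp.text)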

Now I can see that all threads are doing searches.
How can I find from the call stacks what is expensive in the searches?
