High CPU Usage and Load on Data Nodes

We have a 5-node cluster:

3 master nodes - 2 cores, 4 GB RAM, 2 GB heap
2 data nodes - 2 cores, 16 GB RAM, 8 GB heap

Total number of primary shards -

The data nodes constantly have high CPU usage (~90-95%) and load averages of 3-5, but memory and heap usage stay low.

Hot threads logs
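(Hot threads can be pulled straight from the cluster with something like the command below; localhost:9200 is just a placeholder for a node's HTTP endpoint.)

curl -s 'localhost:9200/_nodes/hot_threads?threads=3&interval=500ms'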

Do we need to increase CPU cores or add a new data node? Please suggest.

Is there anything else running on your data nodes?
What does your disk io on the data nodes look like?
iostat -x 2 10
or
sar -d 2 10

Or even just run top and see what %wa (I/O wait) you have.

Are you doing more indexing or more searching?
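
If it's hard to tell from the application side, the thread pool stats give a rough picture from the cluster; something like the following should work (the write pool is called bulk on older versions):

curl -s 'localhost:9200/_cat/thread_pool/write,search?v&h=node_name,name,active,queue,rejected'

Busy write pools with a mostly idle search pool would point at indexing being the main load.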

Hey Peter, thanks for replying.

iostat -x 2 10

Linux 4.9.51-10.52.amzn1.x86_64 (es-data204.ops.use1d.i.riva.co) 11/15/2019 x86_64 (2 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
57.77 0.00 5.07 7.17 0.06 29.94

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.76 0.05 0.88 1.91 25.55 29.58 0.03 32.67 3.05 0.28
xvdf 0.00 589.66 0.75 320.30 174.42 11578.04 36.61 1.14 3.53 0.68 21.96

avg-cpu: %user %nice %system %iowait %steal %idle
39.80 0.00 3.53 9.07 0.00 47.61

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 1.00 0.50 1.00 8.00 16.00 16.00 0.00 0.00 0.00 0.00
xvdf 0.00 537.50 0.00 451.50 0.00 56632.00 125.43 5.01 11.10 0.45 20.40

avg-cpu: %user %nice %system %iowait %steal %idle
75.38 0.00 2.51 2.51 0.00 19.60

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 2.50 0.00 11.00 0.00 128.00 11.64 0.00 0.18 0.18 0.20
xvdf 0.00 426.50 0.00 258.50 0.00 6076.00 23.50 0.22 0.86 0.67 17.40

avg-cpu: %user %nice %system %iowait %steal %idle
34.94 0.00 1.27 9.62 0.00 54.18

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdf 0.00 546.00 0.00 324.50 0.00 12484.00 38.47 1.45 4.46 0.62 20.20

avg-cpu: %user %nice %system %iowait %steal %idle
46.73 0.00 2.51 7.04 0.00 43.72

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdf 0.00 534.50 0.00 257.00 0.00 6916.00 26.91 0.21 0.82 0.63 16.20

avg-cpu: %user %nice %system %iowait %steal %idle
38.58 0.00 4.06 7.87 0.25 49.24

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 1.00 0.00 1.00 0.00 16.00 16.00 0.00 0.00 0.00 0.00
xvdf 0.00 559.00 0.00 252.50 0.00 7108.00 28.15 0.25 0.91 0.70 17.60

avg-cpu: %user %nice %system %iowait %steal %idle
35.70 0.00 3.80 9.62 0.00 50.89

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdf 0.00 569.00 0.00 335.00 0.00 14292.00 42.66 0.54 1.61 0.57 19.00

avg-cpu: %user %nice %system %iowait %steal %idle
35.62 0.00 1.78 8.91 0.00 53.69

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdf 0.00 578.50 0.00 282.00 0.00 7484.00 26.54 0.23 0.82 0.60 16.80

avg-cpu: %user %nice %system %iowait %steal %idle
76.19 0.00 12.78 1.25 0.25 9.52

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.50 0.00 1.00 0.00 12.00 12.00 0.00 0.00 0.00 0.00
xvdf 0.00 415.50 0.00 221.00 0.00 9204.00 41.65 0.26 1.18 0.67 14.80

avg-cpu: %user %nice %system %iowait %steal %idle
52.02 0.00 3.03 5.30 0.00 39.65

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdf 0.00 458.00 0.00 234.50 0.00 6080.00 25.93 0.23 0.99 0.72 16.80

We are mostly indexing and doing very little searching.

The disk I/O metrics are not too extreme here.
I'd probably bump CPU cores up by 2 per node and see how that goes.
A load average above 2 when you only have 2 cores can mean processes are waiting on CPU time, but it can also mean they're waiting on I/O.
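
A quick way to sanity-check which of the two it is on a node, assuming a standard Linux box:

nproc        # core count to compare the load averages against
uptime       # 1, 5 and 15 minute load averages
vmstat 2 5   # us/sy is CPU time, wa is time spent waiting on I/O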

Elasticsearch is a heavily threaded application, so more cores can often help.
The I/O stats don't seem too bad: there isn't much utilisation (%util) or disk queuing in general, but there are spikes to high CPU. If the I/O stats were any worse, an additional node would provide more disk resources to handle the indexing workload, and it might still be a good idea anyway.
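
It can also help to watch CPU, load and heap per node from the cluster itself while the indexing is running, with something like:

curl -s 'localhost:9200/_cat/nodes?v&h=name,node.role,cpu,load_1m,load_5m,heap.percent'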

The other comment I'd make is that with a 5-node cluster, it's probably better to set all nodes as master+data, so the data gets the resources of all 5 nodes rather than 3 of them doing only master duties. Setting every node to master/data and adjusting minimum master nodes for that master-eligible count is what I'd try first. Dedicated master nodes are more useful at higher node counts, where there is more node and cluster state to coordinate and manage.
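
A minimal sketch of that, assuming a pre-7.x cluster where minimum master nodes still has to be set by hand, in elasticsearch.yml on every node:

node.master: true
node.data: true
# quorum of 5 master-eligible nodes: (5 / 2) + 1 = 3
discovery.zen.minimum_master_nodes: 3

(On 7.x and later, minimum master nodes no longer needs to be set; the cluster manages the voting configuration itself.)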

Aiming for a shard count that is a low multiple of the number of nodes avoids some nodes holding more shards than others and spreads the write load more evenly.
For 2 data nodes, probably either 1p1r (1 primary, 1 replica, so 2 shards total) or 2p1r (4 shards total), spread across the 2 nodes. How many indices/index patterns are you indexing into for your "current" indices?
For 5 master/data nodes, 5p1r might be good; or, if you have a few sets of index patterns being indexed into and they all get similar amounts of indexing traffic, even 1p1r or 2p1r can work well. You basically want to spread that write workload as evenly as possible across the disk resources you have available for maximum indexing performance.
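
Shard counts are set per index, usually through an index template for time-based indices. A rough sketch for the 2p1r case, assuming 6.x or later and a made-up my-logs-* pattern:

curl -s -XPUT 'localhost:9200/_template/my-logs' -H 'Content-Type: application/json' -d '
{
  "index_patterns": ["my-logs-*"],
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}'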
