We are using 63 data nodes, 3 master nodes, and 5 coordinator nodes… sharing the machine specs for each category:
- data node: 320 GB SSD, 20 cores, 53 GB memory per node
- coordinator node: 4 cores, 53 GB memory per node
- master node: 4 cores, 24 GB memory per node
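A quick way to cross-check the node roles and per-node resources is the cat nodes API; a minimal sketch, assuming the cluster is reachable on localhost:9200 (a placeholder):

```bash
# List each node's roles plus CPU, heap, RAM and disk usage.
# localhost:9200 is a placeholder for any reachable node in the cluster.
curl -s "http://localhost:9200/_cat/nodes?v&h=name,node.role,cpu,heap.percent,ram.percent,disk.used_percent"
```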
Odd sizes. Are these VMs?
Is the SSD local or accessed via network?
Since you wrote that, I presume we are talking here about either some VMs, or cloud instances, or similar, right?
[ Life was so much simpler when people could say "it's a Dell PowerServe 220, or an HP ProLiant G36, ...". ]
That 320 GB SSD, is it locally attached (in the very real sense) to the host and passed through to the VM, or is it some AWS "EBS storage type", or ...? Or is it maybe "320 GB from a large pool of SSD storage provisioned by the VMware team to the VMs", like happens in a typical corporate environment? The point isn't really to know the precise detail, but to get a sense.
And do you have access to run commands like iostat on the data nodes? If so, what do they show?
Btw, I am a details guy, and that hot threads output is not exactly recent data, is it?
Yes, these are VMs… we have the capability to configure machines with any specs.
For local or network, let me get back on that.
Yes, we are using VMs… let me get back to you with the exact details for the questions you've asked. As for the hot threads part, yeah, it's not recent, but it's the same even now; I just shared the log which I had copied and pasted into my tracker.
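Since the hot threads log being referenced is an old copy, a fresh snapshot can be taken while a spike is actually happening; a minimal sketch, with localhost:9200 as a placeholder endpoint:

```bash
# Capture a fresh hot threads dump from all nodes into a timestamped file.
curl -s "http://localhost:9200/_nodes/hot_threads?threads=5" > hot_threads_$(date +%Y%m%d_%H%M%S).txt
```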
As @RainTown pointed out, how you access your SSD storage is very important. On AWS there are several tiers of SSD-backed EBS, and the cheapest types do not provide anywhere near the performance you would see from a local SSD of good quality.
If you can, try to run iostat on the hot data nodes and check await, IOPS, and other I/O-related statistics.
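For example, something along these lines gives extended per-device statistics: r/s + w/s approximates IOPS, r_await/w_await are average request latencies in milliseconds, and %util shows how busy the device is. The device name and interval here are assumptions:

```bash
# Extended I/O stats for the data disk: 12 samples, 5 seconds apart.
# /dev/vdb is assumed to be the volume holding the Elasticsearch data path.
iostat -x -d /dev/vdb 5 12
```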
The 320 GB SSD is locally attached to the host running the Elasticsearch process, and yes, we have access to run the iostat command.
Do you want to see the iostat output during the periods of CPU spikes, or shall I provide you with the current point-in-time information?
Yes sure, I can do that… sharing the current point-in-time logs (currently CPU is around 30%).
Linux 5.10.0-33-cloud-amd64 12/12/25 _x86_64_ (20 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
11.49 0.14 1.14 0.08 0.00 87.13
Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
vda 8.01 1.79 42.89 15.08 30514728 730631874 256923132
vdb 249.06 149.36 4320.12 124.47 2544293144 73589851768 2120169144
249 tps is not a lot, but we don't know if it was under any stress for that period.
Please run on the 15 "hot" nodes:
iostat -t -c -d /dev/vdb -x 10 360
This will take an hour. Hopefully it will include a period where the node is showing high (90%+) CPU. If it doesn't, then run it again for another hour. Repeat until you get an hour that includes such a period, and please share that hour's data for the hosts impacted.
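Since the capture runs for a full hour, one way (among others) to keep it alive after the SSH session closes is to background it and write to a per-host log file:

```bash
# Run the hour-long capture in the background, one log file per host and run.
nohup iostat -t -c -d /dev/vdb -x 10 360 > iostat_$(hostname)_$(date +%Y%m%d_%H%M).log 2>&1 &
```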
Sure, thanks a lot for the command, I’ll run it and share the results.
This is what we got at around a 60% CPU spike… I'll try to share one for 90% as well, but it seems like we didn't get enough writes today, haha.
12/12/25 19:04:30
avg-cpu: %user %nice %system %iowait %steal %idle
30.42 0.27 3.32 0.55 0.12 65.32
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
vdb 711.00 48579.20 0.00 0.00 0.23 68.33 533.30 7262.80 1280.80 70.60 0.14 13.62 0.00 0.00 0.00 0.00 0.00 0.00 246.40 0.08 0.26 54.52
12/12/25 19:04:40
avg-cpu: %user %nice %system %iowait %steal %idle
30.33 0.21 4.15 0.60 0.12 64.58
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
vdb 494.30 40285.60 0.00 0.00 0.24 81.50 832.10 7205.60 947.90 53.25 0.11 8.66 0.00 0.00 0.00 0.00 0.00 0.00 480.40 0.07 0.25 66.32
12/12/25 19:04:50
avg-cpu: %user %nice %system %iowait %steal %idle
25.46 0.00 3.21 0.84 0.14 70.35
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
vdb 370.80 32260.40 0.00 0.00 0.27 87.00 1736.60 42340.00 5236.80 75.10 0.13 24.38 0.00 0.00 0.00 0.00 0.00 0.00 869.30 0.08 0.39 65.16
12/12/25 19:05:00
avg-cpu: %user %nice %system %iowait %steal %idle
38.40 0.26 6.25 1.01 0.26 53.82
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
vdb 655.90 44006.00 0.00 0.00 0.36 67.09 1745.90 84306.80 14319.00 89.13 0.24 48.29 0.00 0.00 0.00 0.00 0.00 0.00 775.80 0.13 0.77 73.76
Let's see what the higher CPU load stats say, but already:
You say "around a 60% cpu spike" but the logs show CPU, and highest I see is 38%, measured over 10 seconds. How long was the spike to "60%"? A couple of seconds?
The last set shows %util at 73%. That means that for 73% of that 10-second interval, there was an I/O request being processed. r/s : w/s : f/s were 655.90 : 1745.90 : 775.80. That's respectable, but nothing remarkable, and it's already at 73% util.
What's the refresh_interval on the relevant index? Are you using refresh=true?
The refresh interval is 30 seconds.
If possible, could you please explain a bit about what these terminologies mean and how we can determine whether they can lead to a CPU spike or not? That will help us understand Elastic in more depth as well.
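Briefly: refresh_interval controls how often Elasticsearch makes newly indexed documents visible to search (each refresh writes a new Lucene segment, which costs CPU and I/O), while refresh=true on an index/bulk/update request forces an immediate refresh on every call, which can be expensive under heavy writes. A minimal sketch for checking and setting it, with my-index and localhost:9200 as placeholders:

```bash
# Check the current refresh_interval on the index.
curl -s "http://localhost:9200/my-index/_settings?filter_path=*.settings.index.refresh_interval"

# Set it explicitly; longer intervals mean fewer segment writes per second.
curl -s -X PUT "http://localhost:9200/my-index/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index": {"refresh_interval": "30s"}}'
```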
Does this mean implementing bulk indexing requests as outlined here?
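For reference, a bulk request batches many index/update actions into a single HTTP call instead of one request per document; a minimal sketch, with my-index and the documents as placeholders:

```bash
# Two index actions in one _bulk call (NDJSON body, must end with a newline).
curl -s -X POST "http://localhost:9200/_bulk" \
  -H 'Content-Type: application/x-ndjson' \
  --data-binary @- <<'EOF'
{"index": {"_index": "my-index", "_id": "1"}}
{"field": "value-1"}
{"index": {"_index": "my-index", "_id": "2"}}
{"field": "value-2"}
EOF
```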
How are you batching up the data? Are you using a message queue or some other method?
Can you provide some more details about what you did in this layer and how it works?
Is it related to the comment I made about frequent updates to individual documents?
We are batching the data in 2 ways:
It's a Redis cache wherein we are using a key-value sort of data structure, so that the data for one key doesn't create another upsert request for another 30 seconds. This is because, in our case, whenever we experience a higher number of writes, 90% of the IDs are duplicates, so this key-value data structure works.
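A simplified sketch of that 30-second per-key suppression idea, assuming redis-cli is available; the key prefix and document ID are placeholders, not the actual implementation:

```bash
# Only forward an upsert if this document ID has not been written in the
# last 30 seconds. SET ... NX EX 30 succeeds only when the key is absent.
DOC_ID="example-doc-id"   # placeholder
if [ "$(redis-cli SET "seen:${DOC_ID}" 1 NX EX 30)" = "OK" ]; then
  echo "first write for ${DOC_ID} in this window -> send the upsert"
else
  echo "duplicate within 30s -> merge into the pending value instead"
fi
```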
@RainTown, as we discussed in the PM, can you please guide me on how this can be an IOPS issue? I don't have much knowledge about that; if you could, please suggest how to go in depth and figure out if it's actually an IOPS issue. And if yes, then how can we aim to solve it?
That bit is easy. Get faster disks.
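One way to put a concrete IOPS/latency number on the disk itself, independent of Elasticsearch, is a short fio run against a scratch file on the same volume; the path, size, and queue depth here are assumptions, and fio would need to be installed:

```bash
# 60-second random-write test on the data volume; compare the reported IOPS
# and latency with what iostat shows while Elasticsearch is under load.
fio --name=es-disk-check --filename=/path/on/vdb/fio-testfile \
    --rw=randwrite --bs=4k --size=2G --direct=1 \
    --ioengine=libaio --iodepth=32 --numjobs=1 \
    --runtime=60 --time_based --group_reporting
rm -f /path/on/vdb/fio-testfile
```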
OK. Then it sounds like you have addressed the potential issue around frequent updates and implemented bulk requests.
You mention that you have a high indexing load. What does the query load look like? How many concurrent queries do you need to support? What is your latency target/limit? How is the cluster performing from a query perspective now (for both hot and cold IDs)?
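One quick way to see whether search traffic is competing with indexing is the cat thread pool API, which shows active, queued, and rejected tasks per node for the search and write pools; localhost:9200 is a placeholder:

```bash
# Active/queued/rejected counts for the search and write thread pools.
curl -s "http://localhost:9200/_cat/thread_pool/search,write?v&h=node_name,name,active,queue,rejected"
```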