High CPU Usage on a few data nodes / Hotspotting of data

Any update on this? Did it give the expected results?

How did you go about verifying it?

If I were doing this I would create a query that retrieves a set of documents expected to have a high update frequency, and run it periodically, e.g. every N minutes, from a script via cron. The script would run the query with a fixed sort order (to ensure the same set of documents is retrieved where possible) while returning each document's _id and _version. It would store these on disk so the version numbers can be compared between invocations, and whenever it finds that a version number has increased by more than 2*N since the previous invocation it would report this in a log file.
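A minimal sketch of that cron check, assuming Elasticsearch on `localhost:9200`, a hypothetical index name `my-index`, and a `match_all` placeholder where your hot-document query would go:

```python
import json
import os
import urllib.request

STATE_FILE = "/tmp/doc_versions.json"
INTERVAL_MINUTES = 5                 # N: how often cron runs this script
THRESHOLD = 2 * INTERVAL_MINUTES     # report docs updated > 2*N times

def fetch_versions(es_url="http://localhost:9200", index="my-index"):
    """Run the fixed-sort query and return {_id: _version}."""
    query = {
        "size": 100,
        "sort": [{"_id": "asc"}],    # fixed sort: same docs each run
        "version": True,             # include _version in the hits
        "query": {"match_all": {}},  # replace with your hot-document query
    }
    req = urllib.request.Request(
        f"{es_url}/{index}/_search",
        data=json.dumps(query).encode(),
        headers={"Content-Type": "application/json"},
    )
    hits = json.load(urllib.request.urlopen(req))["hits"]["hits"]
    return {h["_id"]: h["_version"] for h in hits}

def find_hot_docs(previous, current, threshold=THRESHOLD):
    """Return {_id: version_delta} for docs updated more than threshold times."""
    return {
        doc_id: version - previous[doc_id]
        for doc_id, version in current.items()
        if doc_id in previous and version - previous[doc_id] > threshold
    }

def main():
    current = fetch_versions()
    previous = {}
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            previous = json.load(f)
    for doc_id, delta in find_hot_docs(previous, current).items():
        print(f"hot document {doc_id}: +{delta} versions since last run")
    with open(STATE_FILE, "w") as f:  # persist state for the next invocation
        json.dump(current, f)
```

Cron would just call `main()` every N minutes; the state file is what lets you compare version numbers across invocations.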

Hey guys, we carried out one more experiment: we installed the XFS filesystem on one of the machines and ext4 on the other, and the results were almost identical except for the “search latency” during writes. There, the XFS filesystem shows nearly 0.5 ms latency compared to 5 ms on the ext4 cluster (roughly 10 times better, it seems).

Any thoughts?

I’ve also collected the iostat and sar command outputs for all of these stress tests if you wish to deep dive :smiley: .

On one of the production nodes, or one of the 2 test systems?

It is an interesting find, though I'm a bit surprised by a 10x difference. Check the mount options on both. If you are just comparing synthetic load on test systems, it would be better to test with real load.
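One quick way to compare: print the filesystem type and mount options behind the data path on each node. The `ES_DATA` path below is an assumption; point it at your actual Elasticsearch data directory.

```shell
# Show filesystem type and mount options for the mount containing ES_DATA.
# ES_DATA is an assumption; defaults to / so the command runs anywhere.
ES_DATA="${ES_DATA:-/}"
findmnt -no FSTYPE,OPTIONS --target "$ES_DATA"
```

Run it on both test systems and diff the output; differences like `noatime`, barriers, or journaling options can easily account for latency gaps between otherwise similar setups.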

It is certainly true that XFS and ext4 are different filesystems, with different internals and will have different performance characteristics, even more so under some specific load patterns. Maybe you have hit a sweet spot for XFS over ext4. ext4 and XFS both journal metadata, but XFS originally came from SGI and IIRC it (XFS) was designed with large SMP systems in mind.

The sar/iostat output would not help me particularly to confirm anything here.

On the 2 test systems, also sorry, it was not 10x…it’s nearly 3x :stuck_out_tongue:

Yeah, well, this type of testing is pretty hard to do, and often the hardest part is having something truly representative of your real world. IT is also full of horror stories that begin with "it worked perfectly, and great, in staging/pre-production".

Gaining deeper insight would likely require low-level instrumentation/tools, eBPF?, to observe IO behavior across the full stack. Given the additional complexity introduced by virtualization, it’s not obvious to me that such an investigation would be worth the effort.

XFS vs ext4 for elasticsearch is a nice topic, maybe someone from Elastic would be interested to do a deep dive, on a blog or similar. It did not take me long to find a 3rd party blog claiming XFS was better than ext4 for elasticsearch, and others suggesting to avoid XFS due to risks of kernel deadlocks/other issues. Admittedly both were from years ago.

BUT as I've said a few times in this thread, your design choice still seems your main limiter.

As I stated earlier, I do not think there are any major gains to be had from tuning the storage layer, as you already seem to have good performance, so I will stay away from that line of further investigation. If there are no issues in your indexing layer (see my previous post), I suspect you will need to make significant changes in order to alleviate or resolve the issue.
