XFS memory allocation problem?

I would like to ask whether this XFS filesystem problem could also affect Elasticsearch when the volume of incoming requests becomes high.

Recently a colleague of mine experienced a memory allocation problem with XFS on RHEL 7 while working with some other software.

https://access.redhat.com/solutions/532663

My colleague temporarily worked around the error by switching to ext3. I have not seen this error on my own systems, perhaps because I have not yet faced traffic that is high enough.

Could this error also affect Elasticsearch? Perhaps it happens when using slow magnetic disks under high traffic load?

This error message from XFS is in fact a symptom of high memory fragmentation.

You ask about the situation when incoming requests become high. I'm not sure; it depends. If you mean indexing requests, which are memory intensive, then it may be relevant. If you mean only search requests, then it is probably less relevant, as long as the requests don't exercise the operating system's virtual memory subsystem.

Watch these diagnostics for memory fragmentation:

  • check the number of memory regions allocated to the ES process: grep -c '^Size' /proc/<pid>/smaps
  • compare that count with the limit in /proc/sys/vm/max_map_count

If the smaps count comes close to max_map_count, you are heading for trouble.
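A quick way to run this check is a small script like the following sketch. The pgrep pattern for finding the ES process is an assumption; adjust it to match your installation.

    #!/bin/sh
    # Sketch: compare the ES process's mapped-region count with the kernel limit.
    # Assumption: the JVM was started with the standard Elasticsearch bootstrap class.
    PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -n 1)
    VMAS=$(grep -c '^Size' "/proc/$PID/smaps")   # one Size: line per memory region
    LIMIT=$(cat /proc/sys/vm/max_map_count)      # kernel limit on regions per process
    echo "VMAs in use: $VMAS of $LIMIT allowed"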

With the default configuration, Elasticsearch demands a high number of virtual memory areas (VMAs) when the indexing load increases, and this can lead to memory fragmentation. This fragmentation on the operating system layer (not the Java heap) can, in rare circumstances, be responsible for the operating system running out of memory and the JVM crashing. It's very hard to track down the exact cause of the trouble. XFS also detects such problems when memory gets fragmented; in this respect it is smarter than ext3/ext4. So this issue might or might not be related to Elasticsearch under heavy load, and Elasticsearch is not necessarily the only reason why XFS may report this message.
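If you want to see the fragmentation the kernel itself is facing, /proc/buddyinfo is one place to look (a standard Linux interface, not specific to Elasticsearch): each column shows how many free blocks of a given order remain, and rows that trail off into zeros mean that higher-order allocations, like the ones XFS complains about, can no longer be satisfied.

    # Free memory blocks per zone and allocation order (2^0 .. 2^10 pages);
    # zeros in the right-hand columns indicate a fragmented zone
    cat /proc/buddyinfo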

The ES core team's reaction to this challenge was to increase vm.max_map_count and, to keep users out of trouble, to add a strict startup-time check for the increased value. There is no real danger in increasing the value, but it does not fix the underlying risk of memory fragmentation.
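For reference, the commonly recommended value is 262144. A sketch of setting it on RHEL 7 (as root); the file name under /etc/sysctl.d is my own choice:

    # apply immediately to the running kernel
    sysctl -w vm.max_map_count=262144
    # persist across reboots
    echo 'vm.max_map_count = 262144' > /etc/sysctl.d/99-elasticsearch.conf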

My Elasticsearch configuration on RHEL 7, running on physical machines in my own data center, is different from the default.

In short, here is what my configuration looks like:

Use at your own risk. Take safe steps and know what you are doing. This is just my experience; yours may be different.

There are also other advanced methods for the same goal available on RHEL 7, including setting up HugePages to take some load off the VM allocator; see https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-Memory-Tuning.html This probably also helps Elasticsearch, but I have not tried HugePages yet.
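If you want a quick look at whether HugePages are already in play on a box, these standard RHEL 7 interfaces show the current state:

    # HugePages counters and the configured huge page size
    grep -i huge /proc/meminfo
    # Transparent Huge Pages policy: [always], madvise, or never
    cat /sys/kernel/mm/transparent_hugepage/enabled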

@jprante

Thank you for the detailed reply.

For now, I will first make sure I set the recommended value for vm.max_map_count.
