OOM since 8.16.1 with openjdk23

Hi,

After upgrading from 8.15.1 to 8.16.1, all machines in two of our four ES clusters are running out of memory after 7-12 hours.

Our current setup for all clusters is:
System: Ubuntu 22
3 master nodes
3 data nodes (64 GB RAM, 32 GB Xmx, 48 CPUs)
2 Kibana nodes

The internal ES monitoring also does not show any issues with heap. I cannot upload it directly here, as the company does not allow it. If you need it, I can upload it to an image host of your choice.

This is the message I can find in my syslog:

Dec  3 01:20:46 datanode1 systemd-entrypoint[296225]: # There is insufficient memory for the Java Runtime Environment to continue.
Dec  3 01:20:46 datanode1  systemd-entrypoint[296225]: # Native memory allocation (malloc) failed to allocate 1048576 bytes. Error detail: AllocateHeap
Dec  3 01:20:46 datanode1  systemd-entrypoint[296225]: # An error report file with more information is saved as:
Dec  3 01:20:46 datanode1  systemd-entrypoint[296225]: # /var/log/elasticsearch/hs_err_pid296225.log

I can also share the JVM fatal error log and the last GC logs before the crash.
The gc.log always ends with a full pause before the crash:

[2024-12-03T01:20:46.380+0000][296225][gc,start    ] GC(632) Pause Full (System.gc())
[2024-12-03T01:20:46.380+0000][296225][gc,task     ] GC(632) Using 33 workers of 33 for full compaction

I didn't change anything in jvm.options; I only set Xms/Xmx and LimitMEMLOCK=infinity.
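For reference, this is roughly what those two settings look like on our nodes (the systemd override path is an assumption about a standard package install; the values are ours):

jvm.options (the only lines we touched):
-Xms32g
-Xmx32g

/etc/systemd/system/elasticsearch.service.d/override.conf:
[Service]
LimitMEMLOCK=infinity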

I think it could be related to the switch to OpenJDK 23, as we are using the Java bundled with ES. Do you think it's worth installing OpenJDK 22 on the system and using that instead? Otherwise I have no clue how to get out of this.
Even adding 18GB of RAM (to 82GB) did not help - still OOM.
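In case it matters for that experiment: on the Debian/Ubuntu package you can point Elasticsearch at a separately installed JDK via ES_JAVA_HOME rather than the bundled one; the JDK path below is a placeholder.

# /etc/default/elasticsearch
ES_JAVA_HOME=/opt/jdk-22   # placeholder path to the manually installed JDK

followed by a systemctl restart elasticsearch.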

Thanks!

Can you try a 31 GB heap (Xms/Xmx) in jvm.options?
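(The usual reason for staying just under 32 GB is that the JVM can then use compressed ordinary object pointers. If you want to verify that a given heap size still qualifies, something like this should show it; the path assumes the bundled JDK from the deb package:)

/usr/share/elasticsearch/jdk/bin/java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops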

You can also try the following two values in your sysctl.conf file:

vm.overcommit_memory=2
vm.overcommit_ratio=85
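(To apply them without a reboot you can set them directly and/or reload sysctl.conf, e.g.:)

sysctl -w vm.overcommit_memory=2
sysctl -w vm.overcommit_ratio=85
# or, after adding the two lines to /etc/sysctl.conf:
sysctl -p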

Thanks for the tips. I will try both of them on different machines and get back to you.

I also tried OpenJDK 22 on one of the machines, but it still crashed.

Even after trying all of the mentioned tips, we still have OOMs twice a day on all data nodes, even on the node with OpenJDK 22. So it does not look like an issue with the new OpenJDK.

We are ingesting data with Logstash. Maybe the pressure on ES is too high? Should we try to decrease the batch size?
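(If we try that, I assume it would just mean lowering these in logstash.yml; the values below are the Logstash defaults, shown only as a reference point, not a recommendation:)

pipeline.batch.size: 125   # events per batch sent to the outputs (default 125)
pipeline.batch.delay: 50   # ms to wait before flushing an undersized batch (default 50)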

I upgraded my test bed last week and have had no issues so far.

I'm also using the same OpenJDK provided by Elastic:

java -version
openjdk version "23" 2024-09-17
OpenJDK Runtime Environment (build 23+37-2369)
OpenJDK 64-Bit Server VM (build 23+37-2369, mixed mode, sharing)

How much data are you writing?
I have a lot of data in my test setup, but I let Logstash handle the limits and don't tune anything myself.

cat ../logstash.yml |grep -v ^#
node.name: "myhostname"
path.data: /s1/logstash
pipeline.batch.size: 256

http.host: "myhostname"
http.port: 9600
log.level: info
path.logs: /s1/log/logstash

My jvm.options looks like this:

cat ../jvm.options |grep -v ^#
-Xms15g
-Xmx15g
14:-XX:+UseG1GC

Logstash is running on its own VM.

If a node is consistently going to crash, you can monitor it with some other old-school tools (top/htop/vmstat/...) until it crashes, pipe the output to files, and look at them after the crash.
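Something along these lines, for example (the pgrep pattern and the intervals are just assumptions to adapt):

# system-wide memory stats every 30s until the crash
vmstat -t 30 > /var/tmp/vmstat.log &

# the ES process's resident size, virtual size and thread count every 30s
while true; do
  date
  ps -o rss,vsz,nlwp -p "$(pgrep -f org.elasticsearch | head -1)"
  sleep 30
done > /var/tmp/es_mem.log &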

I am curious whether memory pressure grows steadily until it crashes, or whether something goes wild/wrong and it snowballs very rapidly.

might be helpful

I'd just like to add that I've reported the same issue happening on both of our clusters since upgrading from 8.15 to 8.17.

More details are in the thread but quick summary:

  • Two completely isolated clusters of completely different sizes are showing OOMs since upgrading.
  • The search & indexing pattern is consistent with what it has always been, and we had seen zero OOMs in the year or so we've been running 8.x on these clusters; only since upgrading to 8.17 a couple of days ago have we seen 10+ OOMs across nodes.
  • The OOMs seem to happen across all our hot nodes within a period of a couple of hours (i.e. all hot nodes will OOM once within a given period).
  • We did not change the JVM version in the 8.16 -> 8.17 upgrade.

@ALIT What do you have set on your machine for the value of vm.max_map_count if you run sysctl -a?

We've always had this set to 262144, as per Elastic's recommendation.

I set this to an artificially lower value in our testing environment and waited for this value to be reached by the Elastic process; the process exited with the exact same error I've been seeing.

I've now doubled this value on our clusters to see if it prevents, or delays, the OOMs we've been seeing. I've already observed that the number of memory regions in use on some of our hot nodes is greater than the previous limit, so I'm more confident this is the source of the problem. Whether things will just grow to the next limit or not, I don't know.

I would be interested to know if you're able to observe the same in your cluster. You can do wc -l /proc/<PID>/maps to see the current number in use.
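(Run on a data node, something like this shows the limit and the current usage side by side; the pgrep pattern is an assumption about how the JVM process appears in the process list:)

sysctl vm.max_map_count
wc -l /proc/"$(pgrep -f org.elasticsearch | head -1)"/maps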

Out of curiosity, did either @Evesy or @ALIT resolve the memory issue?

@RainTown I've not seen any issues since increasing the limit mentioned above, so I would say it's resolved in that sense.

Which limit did you increase? vm.max_map_count? What's your new limit to prevent the crash?

sysctl -a | grep max_map_count
vm.max_map_count = 262144

wc -l /proc/3919887/maps
215 /proc/3919887/maps

We just doubled it to see where that would leave us, and we did subsequently observe the value in use get to the circa 400k mark after increasing it.

The amount in use in your output looks really low, but we did observe ours starting low and then growing over the next 24 hours of operation.

I just realized I checked the wrong process.
This is the correct one:

173802 /proc/1841515/maps

I increased it to 500k to see if it helps.
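(For anyone following along, raising it and keeping it across reboots is something like this; the drop-in file name is just a convention:)

sysctl -w vm.max_map_count=500000
echo 'vm.max_map_count=500000' > /etc/sysctl.d/99-elasticsearch.conf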

No crash on the reconfigured node. I think you nailed it, @Evesy.

current stats:

466096 /proc/1841515/maps

The change in behaviour in 8.16.1+, which both of you reported, seems worthy of a bug report to me.

@ALIT Everything still looking ok since you made the change?

I've opened Elasticsearch 8.16.x Large Increase in MMAP Counts · Issue #119652 · elastic/elasticsearch · GitHub as a bug.

I changed it on 31st Dec.
I have had 3 crashes since, so the issue is still there, but there are a lot fewer crashes.

I guess I could increase max_map_count even higher to solve it, but I have no idea what impact that would have on the system.

It would be useful if someone can provide the hs_err_pidXXXX.log generated when it crashes, either here, on the github issue or as a gist.

@ALIT Is this something you'd be able to capture & share given you're still seeing the errors?

For us the hs_err_pidXXXX.log is written to a location that is not persisted across pod restarts, so we would need to make changes to keep it, or move it, somewhere persistent.
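(One possible option: the JVM's crash report path can be redirected with -XX:ErrorFile, e.g. via a jvm.options entry pointing at a persistent volume; the path below is a placeholder, and %p expands to the PID:)

-XX:ErrorFile=/mnt/persistent/logs/hs_err_pid%p.log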