Elasticsearch Docker container keeps crashing with exit status of 137

Hi Everyone,

I have a docker-xpack Docker image that I deploy on Marathon/Mesos. The image works great and is uber-stable when xpack security is disabled. When I enable xpack security and use encrypted comms, the containers crash randomly every 2-3 days with no message other than the container exiting with status 137. Status 137 usually means Mesos killed the container because its RAM usage exceeded the configured max.
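For context on the 137: by the usual shell convention, an exit status above 128 is 128 plus the number of the signal that ended the process, so 137 maps to signal 9 (SIGKILL). Here is a minimal Python sketch of that arithmetic. The helper name `describe_exit_status` is made up for illustration, and a SIGKILL on its own does not prove an out-of-memory kill, it only says the process was killed forcibly.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret an exit status using the 128 + signal-number convention."""
    if status > 128:
        signum = status - 128
        try:
            # Look up the symbolic name for this platform, e.g. SIGKILL for 9.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"


print(describe_exit_status(137))
```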

I have upped both the JVM heap and the Docker RAM limit (via the Mesos configuration) from 2GB/4GB to 4GB/8GB, and I am still getting exits with status 137. In addition, I instrumented one of the containers by calling docker stats every 5 minutes: the overall RAM keeps creeping up until it reaches the Mesos limit for the container, at which point Mesos kills it off.
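In case it helps anyone reproduce the measurement, here is a rough Python sketch of that kind of polling. It assumes `docker stats --no-stream --format "{{.MemUsage}}"` prints something like `3.97GiB / 4GiB` with binary units (older or newer Docker versions may format this differently), and the function names are hypothetical, not the script I actually ran.

```python
import re
import subprocess

# Multipliers for the binary units assumed to appear in docker stats output.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size string such as '3.97GiB' to a byte count."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(B|KiB|MiB|GiB|TiB)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return int(float(value) * _UNITS[unit])


def parse_mem_usage(line: str) -> tuple:
    """Split a 'used / limit' line into (used_bytes, limit_bytes)."""
    used_text, _, limit_text = line.partition("/")
    return parse_size(used_text), parse_size(limit_text)


def sample(container: str) -> tuple:
    """Take one memory reading for a container via the docker CLI."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.MemUsage}}", container],
        capture_output=True, text=True, check=True,
    )
    return parse_mem_usage(result.stdout.strip())


# Example usage (needs a running container; the name is a placeholder):
#   used, limit = sample("my-es-container")
#   print(f"{used / limit:.1%} of the container limit in use")
```

Running `sample` from cron or a sleep loop every 5 minutes and logging the result is enough to see whether usage trends toward the limit.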

Has anyone else had trouble with an elasticsearch-xpack Docker image? Any ideas/suggestions are welcome.

Thanks

--John

What Elasticsearch version are you on?

Argh, sorry, I should have specified that--ES 5.6.4

More information. This is an Elasticsearch container with the default jvm.options (and therefore a 2 GB heap) and 4 GB dedicated to the container.

The heap is staying steady in the 1 GB range while the container is up to 3.97 GB. So there appears to be off-heap memory pushing total usage past 4 GB, at which point Mesos kills the container.
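To make the arithmetic explicit, here is a small Python sketch that brackets the non-heap share using the figures above (3.97 GB container usage, roughly 1 GB of heap in use, 2 GB configured heap, 4 GB limit). The function name is hypothetical, and the two bounds rest on an assumption: either the whole configured heap is resident, or only the part currently in use is.

```python
def off_heap_range_gb(container_gb, heap_used_gb, heap_max_gb):
    """Return (low, high) estimates of non-heap memory in GB.

    low assumes the entire configured heap is resident in RAM;
    high assumes only the heap currently in use is resident.
    """
    low = round(container_gb - heap_max_gb, 2)
    high = round(container_gb - heap_used_gb, 2)
    return low, high


# Figures taken from the post above; the 1.0 is approximate ("1 GB range").
low, high = off_heap_range_gb(container_gb=3.97, heap_used_gb=1.0, heap_max_gb=2.0)
headroom = round(4.0 - 3.97, 2)

print(f"non-heap memory is roughly between {low} and {high} GB")
print(f"headroom before the container limit: {headroom} GB")
```

Either way, somewhere around 2 to 3 GB sits outside the Java heap, which is why raising the heap size alone would not be expected to help.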

--John

Interesting update. I have another cluster with xpack security enabled, using the same Docker image but with a 31 GB heap and 62 GB of container memory, and this one has been stable for several weeks.

Question: is anyone aware of a minimum memory heap size for elasticsearch with xpack enabled?

--John

More interesting information.

I set MALLOC_ARENA_MAX=4 and that slowed down the growth of off-heap memory, but did not stabilize it.

A colleague of mine who knows Elasticsearch far better than I do updated the refresh_rate from 10s to 300s.

The combo of MALLOC_ARENA_MAX=4 and refresh_rate=300s appears to have stabilized things for now.
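For anyone wanting to try the same combination, here is a hedged Python sketch of the two pieces. It assumes that "refresh_rate" above refers to the Elasticsearch index setting `index.refresh_interval` (that mapping is my assumption), and that `MALLOC_ARENA_MAX` is passed to the container as an ordinary environment variable read by glibc. The helper name and the way the values are delivered are illustrative only.

```python
import json

# Environment variable for the container; glibc reads MALLOC_ARENA_MAX at
# process start to cap the number of malloc arenas.
CONTAINER_ENV = {"MALLOC_ARENA_MAX": "4"}


def refresh_interval_body(interval: str) -> str:
    """Build a JSON body for an index settings update.

    Assumes the setting in question is index.refresh_interval; the body
    would be sent with PUT to /<index>/_settings.
    """
    return json.dumps({"index": {"refresh_interval": interval}})


print(CONTAINER_ENV)
print(refresh_interval_body("300s"))
```

A longer refresh interval means new documents take longer to become searchable, so 300s is a trade-off rather than a free fix.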

Will monitor and report an update in a few hours.

On one of the clusters, memory usage for a 2GB/4GB (heap/container) combo is at 2.38 GB, so the combination of MALLOC_ARENA_MAX=4 and refresh_rate=300s appears to have stabilized things there for now.

On the other cluster (with more data), the memory consumed by the ES Docker continues to increase, albeit more slowly.

Again, I can confirm that xpack is not causing this, as I've seen it both with xpack enabled and with it disabled.
