Excessive RAM usage, gets OOM killed


#1

Hi!

I'm running 4 basic pipelines on Logstash with persisted queues. Everything ran smoothly until the output went down for a while and the queue filled up to its 1GB limit. Now that the output is back online, Logstash cannot flush the queue to it: even though it is configured to use 1GB of RAM (default jvm.options file), it uses more than 5GB and ends up getting killed by the OOM killer. I tried tweaking the setting, giving it 5GB, 300MB, etc., but whatever I do, Logstash seems to ignore the RAM limit I give it and keeps using more RAM until it gets killed.

I see this issue both with Logstash installed on-premises and with Logstash running in Docker.

How can I control RAM usage so that Logstash does not get killed on a machine with ~5GB of free RAM, while still keeping the persisted queue? Is there anything I need to change in jvm.options or in the queue page size options?
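To put a number on the gap between the configured heap and what the OOM killer sees, the process's resident memory can be compared with the -Xmx value. The heap setting only caps the Java heap; thread stacks, direct buffers and memory-mapped files (which, as far as I know, include the persisted queue pages) are counted outside it. The sketch below is only an illustration: it assumes a Linux host where /proc/<pid>/status is readable, and the function names are made up for this example.

```python
import re


def parse_proc_status(text):
    """Return the kB-valued fields of a /proc/<pid>/status dump as a dict of ints."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)\s+kB$", line.strip())
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def off_heap_estimate_mb(status_text, max_heap_mb):
    """Resident memory minus the configured max heap, in MB.

    This is a rough lower bound on memory held outside the Java heap,
    because the heap may not be fully resident.
    """
    rss_mb = parse_proc_status(status_text)["VmRSS"] / 1024
    return rss_mb - max_heap_mb


def read_status(pid):
    """Read the status file of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return handle.read()
```

Calling `off_heap_estimate_mb(read_status(pid), 1024)` with the Logstash process ID gives the part of the resident memory that the 1GB heap limit cannot account for.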

Thanks in advance for your help,

Cyril


(Aaron Daisley) #2

Could you use VisualVM to capture heap usage? I'm having a similar issue and would be interested in seeing the graphs.


#3

I'm not sure how to use VisualVM. How does it work? I don't know or use Java apart from Logstash, which may explain some of the trouble I'm having.


(Aaron Daisley) #4

This should give some info on it:
https://www.elastic.co/guide/en/logstash/current/tuning-logstash.html

Provided you're using docker-compose, add this under environment.LS_JAVA_OPTS:

-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9010 -Dcom.sun.management.jmxremote.rmi.port=9010 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=[HOST NAME HERE]

You should then be able to install VisualVM, start Logstash, and connect to your host through a new JMX connection. I'm not too knowledgeable about the software, unfortunately.
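Before opening VisualVM, it may help to confirm that the JMX port is actually reachable from your workstation, since a closed port or a missing Docker port mapping looks the same as a misconfigured JVM from the VisualVM side. A minimal sketch, assuming the port 9010 from the flags above (both the JMX and RMI ports are set to it, so one check should be enough); the function name is made up for this example:

```python
import socket


def jmx_port_reachable(host, port=9010, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

If this returns False for your Logstash host, the problem is in the network path or the container port mapping rather than in VisualVM.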

You'll also see heap usage if you have X-Pack monitoring enabled on your Logstash instance.
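Without X-Pack, the heap figures should also be available from the Logstash monitoring API. The sketch below assumes the API is listening on its default address, localhost:9600, and that the `/_node/stats/jvm` response contains the `jvm.mem` fields named in the code, as I remember them from the node stats documentation; please check them against your version.

```python
import json
from urllib.request import urlopen


def heap_summary(stats):
    """Extract heap and non-heap usage (in MB) from a node stats response."""
    mem = stats["jvm"]["mem"]
    return {
        "heap_used_mb": mem["heap_used_in_bytes"] / 2**20,
        "heap_max_mb": mem["heap_max_in_bytes"] / 2**20,
        "heap_used_percent": mem["heap_used_percent"],
        "non_heap_used_mb": mem["non_heap_used_in_bytes"] / 2**20,
    }


def fetch_jvm_stats(base_url="http://localhost:9600"):
    """Fetch the JVM section of the node stats (assumed default API address)."""
    with urlopen(f"{base_url}/_node/stats/jvm", timeout=5) as response:
        return json.load(response)
```

Running `heap_summary(fetch_jvm_stats())` on the Logstash host returns the same kind of numbers that the Kibana heap graph is built from.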


#5


Here's a screenshot from Kibana of the last hour of my Logstash instance. Notice how it restarted many times before somehow "stabilizing" itself for the last 15 minutes. It is kind of weird...
It is still running a test I set up with a max heap of 300MB, as can be seen on the heap usage graph. This particular machine has 3GB of RAM, with 300MB of heap for Logstash, but it still got killed many times in the last hour.

However, my queues are still not emptying:

du -h
584M    ./external_out
337M    ./main
57M     ./external_in
16K     ./internal
978M    .

#6

Any hints?

I can't get Logstash to work, even with a lot of RAM :frowning:


#7

Bump!

Logstash is still crashing: sometimes with an error message about creating new native threads, sometimes without one, and sometimes a single pipeline dies for no apparent reason.
It's really frustrating. It feels as if Logstash wasn't made to be used.
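An error about creating new native threads usually means the operating system refused to start another thread, which points to a process or thread limit, or to memory outside the heap, rather than to the heap setting itself. One thing that could be checked is how many threads Logstash is running compared with the limit that applies to it. This is only a sketch, assuming a Linux host with /proc/<pid>/status and /proc/<pid>/limits available; the function names are made up for this example.

```python
import re


def thread_count(status_text):
    """Return the value of the 'Threads:' line of /proc/<pid>/status, or None."""
    match = re.search(r"^Threads:\s+(\d+)$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def max_processes_soft_limit(limits_text):
    """Return the soft 'Max processes' limit from /proc/<pid>/limits.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    match = re.search(r"^Max processes\s+(\S+)\s+(\S+)", limits_text, re.MULTILINE)
    if not match:
        return None
    soft = match.group(1)
    return None if soft == "unlimited" else int(soft)
```

Note that the "Max processes" limit is counted across everything the same user runs, not only Logstash, so the thread count of one process can be well under the limit and still hit it.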