Out of memory exception under high load

Hi,

Our Logstash servers are configured to use only the in-memory queue, since we are still prototyping. A few times now, the Logstash process has exited with an out-of-memory exception.

This only seems to happen when Logstash is under high load, for instance after Elasticsearch had been stopped for a few hours and a huge backlog (millions of messages) had built up across many Filebeat and Winlogbeat instances. When this happens, Logstash's memory usage keeps climbing until it hits the maximum available heap (10 GB) and it eventually throws an out-of-memory error.
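For reference, the 10 GB heap comes from Logstash's config/jvm.options (a minimal sketch of just the relevant lines, matching our setup):

```
# config/jvm.options (relevant lines only)
-Xms10g
-Xmx10g
```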

I would like to confirm, with the in-memory queue only:

  1. Will Logstash stop accepting new messages when the queue is full?
  2. Will it apply back-pressure to slow down the beats (such as Filebeat, Metricbeat, and Winlogbeat) when the queue is full?

If the answer to both questions is yes, do you have any suggestions as to why we are seeing these crashes? I would expect that when the queue is full or close to full, Logstash would apply back-pressure to all the beats, or even stop accepting new messages, but that does not seem to be the case.
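For context, the relevant part of our logstash.yml looks roughly like this (a sketch; the worker and batch values shown are the documented defaults, not necessarily what we run):

```
# logstash.yml (sketch; values are illustrative)
queue.type: memory          # the default in-memory queue
pipeline.workers: 8         # defaults to the number of CPU cores
pipeline.batch.size: 125    # max events each worker collects per batch
```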

Maybe only the persistent queue can do that?

Any help would be much appreciated! Thanks!

Ed

Which version of Logstash are you using? Are you using the grok filter?

I have seen reports of a memory leak in the grok filter, which I believe was fixed in version 4.0.3 of that plugin. You can check which version you are running with the following command: bin/logstash-plugin list --verbose

If you are on an older version, you can update that plugin and see if that makes a difference.
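For example (the grep and the version shown are just illustrative; your output will differ):

```
bin/logstash-plugin list --verbose | grep grok
# e.g. logstash-filter-grok (4.0.2)

bin/logstash-plugin update logstash-filter-grok
```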

Thanks Christian!

Yes, we have quite a few grok filters, and they are at version 4.0.2. I looked at the link you provided, and that is exactly the behavior we are experiencing. Since we switched to the persistent queue, things seem fine so far.
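For anyone who finds this thread later, the switch was roughly this change in logstash.yml (a sketch; the max_bytes value is our choice, not a recommendation):

```
# logstash.yml (sketch)
queue.type: persisted    # was: memory
queue.max_bytes: 4gb     # disk space the queue may use before back-pressure
# path.queue defaults to a directory under path.data; we left it unchanged
```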

Do you know if the memory leak applies only when using the memory queue?

Other than the possible memory leak, should back-pressure handling work the same way regardless of whether the memory queue or the persistent queue is used?

Thanks,
Ed

If you upgrade your grok filter, I believe you should be fine.

If you have the persistent queue enabled, Logstash can naturally enqueue much more data than it can with the very small in-memory queue, so back-pressure kicks in later.
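In other words, back-pressure still applies with the persistent queue; it just engages once the on-disk queue reaches its size limit. If you want it to engage sooner, you can lower that limit (sketch; the value is illustrative):

```
# logstash.yml (sketch)
queue.max_bytes: 1gb    # back-pressure begins once the queue reaches this size
```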
