Logstash OOM - understanding heap sizing

The persistent queue uses memory-mapped files, and these are read into off-heap memory. A queue that big will have thousands of page and checkpoint files, but only a few of them are in memory at any one time: the 'head' page, where writes go, and a few 'tail' pages, where reads and acks happen.
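For a sense of scale, here is a rough sketch of the page-file count, assuming the default queue.page_capacity of 64 MB (check your logstash.yml if you have changed it):

```python
# Rough estimate of how many page files a ~200 GB persistent queue holds,
# assuming the default queue.page_capacity of 64 MB.
queue_bytes = 200 * 1024**3        # ~200 GB of queue on disk
page_capacity = 64 * 1024**2       # assumed default PQ page size

page_files = queue_bytes / page_capacity
print(f"~{page_files:.0f} page files")   # ~3200 page files
```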

There is also the number of 'events in flight' to consider. A batch of 125 events is read by each worker thread, so with, say, 24 workers that is 125 x 24 = 3000 events in LS at any one time. When a batch of events is JSON-serialized for the bulk index request to Elasticsearch, the memory for that batch is roughly doubled. Because the Elasticsearch output is multithreaded, each worker's batch can be serialized into memory at the same time.
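A minimal sketch of that arithmetic, assuming the default pipeline.batch.size of 125 and the 24 workers mentioned above:

```python
# Events held in memory by the workers at any one time.
batch_size = 125      # pipeline.batch.size (default)
workers = 24          # pipeline.workers (assumed from the discussion above)

events_in_flight = batch_size * workers
print(events_in_flight)   # 3000 events in flight

# Each batch is also JSON-serialized for the Elasticsearch bulk request,
# so memory for that batch is roughly doubled while the request is built.
```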

To calculate how much memory to give LS, you need to know the average and maximum size of your events.
For a 1 MB average event size, 3000 events can mean roughly 6000 MB of memory consumption (3000 MB for the events themselves plus another 3000 MB for their serialized copies). In practice, though, each batch is a mixed bag of event sizes.
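Roughly, for that worst case (treating 1 MB as a purely illustrative average):

```python
# Heap needed for in-flight events if every event averaged 1 MB.
avg_event_bytes = 1 * 1024**2               # assumed 1 MB average event
events_in_flight = 3000

heap_bytes = events_in_flight * avg_event_bytes * 2   # events + serialized copies
print(f"~{heap_bytes / 1024**2:.0f} MB")               # ~6000 MB
```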

However, from your reported queue size of 200 GB @ 130 million events we get 2e11 / 130e6 ≈ 1538 bytes per event, or roughly 1.5 KB, and at that size 3000 in-flight events (plus their serialized copies) would consume only about 9 MB of heap.
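The same sketch, using the average derived from your reported queue statistics:

```python
# Average event size derived from the reported queue size and event count.
queue_bytes = 200e9            # ~200 GB queue on disk
queued_events = 130e6          # ~130 million events

avg_event_bytes = queue_bytes / queued_events        # ~1538 bytes (~1.5 KB)
heap_bytes = 3000 * avg_event_bytes * 2              # in-flight + serialized copies
print(f"{avg_event_bytes:.0f} B/event, ~{heap_bytes / 1e6:.0f} MB of heap")  # ~9 MB
```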

Have you changed the batch size? Do you know whether you have any extremely large events? Say 100MB+.