Some things to know about the Logstash persistent queue

Normally, the number of filter workers equals the number of CPU cores.

For any queue type, each of 8 workers takes a batch of 125 events (the default) from the queue, so 8 * 125 = 1000 events are in flight. Each worker loops around and takes another batch. The loop speed of any one worker is affected by the data complexity, the filters' complexity and the RTT to all outputs. The inputs operate in a loop, reading files or sockets and putting events one by one into the queue. If the inputs' loop speeds are higher than the filter workers' loop speeds, then the queue grows up to the limit - at which point the input loops stop and wait for write permission.
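To make the back-pressure idea concrete, here is a minimal Python sketch (not Logstash code). It uses a bounded `queue.Queue` as a stand-in for the Logstash queue; the names `fill_until_blocked` and `take_batch` are made up for illustration. A real input would block when the queue is full, whereas this sketch simply stops so the effect is easy to see.

```python
import queue

def fill_until_blocked(capacity, events):
    """Put events into a bounded queue until it is full.

    Returns the number of accepted events and the queue itself.
    """
    q = queue.Queue(maxsize=capacity)
    accepted = 0
    for event in events:
        try:
            # A real input would block here waiting for space;
            # the sketch stops instead so the limit is visible.
            q.put_nowait(event)
        except queue.Full:
            break
        accepted += 1
    return accepted, q

def take_batch(q, batch_size=125):
    """Take up to batch_size events, as one filter worker would per loop."""
    batch = []
    while len(batch) < batch_size:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch
```

Once a worker takes a batch, space frees up and the input can write again, which is the "write permission" mentioned above.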

For the memory queue, the size is batch_size * workers * 2 = 125 * 8 * 2 = 2000 events. Those 2000 events can fill 1KB or 10GB - it depends on how much data is in those events.
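The arithmetic above can be written out as a small Python sketch. The function names and the `avg_event_bytes` parameter are hypothetical, chosen only to illustrate the numbers; the defaults mirror the 8 workers and batch size of 125 used in this post.

```python
def in_flight_events(workers=8, batch_size=125):
    """Events held by the filter workers at any one time."""
    return workers * batch_size

def memory_queue_capacity(workers=8, batch_size=125):
    """Memory queue size in events: batch_size * workers * 2."""
    return batch_size * workers * 2

def queue_bytes(events, avg_event_bytes):
    """Rough memory footprint; avg_event_bytes is an assumed average."""
    return events * avg_event_bytes
```

For example, `queue_bytes(memory_queue_capacity(), 512)` gives about 1MB, while an assumed average of 5MB per event would put the same 2000 events at around 10GB.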

The persistent queue uses memory-mapped files. It creates a read/write Page of configurable size (default 250MB - will decrease in 6.1 or 6.2) as an mmap file. The read/write Page is written to in off-heap memory; when it is full, a new read/write Page is created and the old read/write Page is made read-only. If the queue is almost empty, then the events are written to and read from the one read/write Page almost immediately. If the queue is bigger than one Page, then only the read/write Page plus one or two read-only Pages are actually in memory, because a Page that is not being read from or written to is unloaded. So the persistent queue size is limited by disk space, or by hard limits if you add those to the settings file. As events must fit in a Page (no cross-page spanning), if any event is bigger than 250MB then you must increase the page size to suit. Bear in mind, though, that bigger Pages take longer to fsync.
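As a rough illustration of the Page rollover rule, here is a toy in-memory model in Python. It is only a sketch under simplifying assumptions: it does not use mmap, it ignores any per-event overhead, and the `PagedQueue` class is hypothetical rather than anything from Logstash.

```python
class PagedQueue:
    """Toy model: events are appended to the last (read/write) page."""

    def __init__(self, page_capacity):
        self.page_capacity = page_capacity  # bytes per page
        self.pages = [[]]                   # pages[-1] is the read/write page
        self.used = 0                       # bytes used in the read/write page

    def write(self, event: bytes):
        # An event may not span pages, so it must fit in a single page.
        if len(event) > self.page_capacity:
            raise ValueError("event larger than a page; increase the page size")
        # Not enough room left: start a new read/write page.
        # The previous page is now effectively read-only.
        if self.used + len(event) > self.page_capacity:
            self.pages.append([])
            self.used = 0
        self.pages[-1].append(event)
        self.used += len(event)
```

With a page capacity of 10 bytes, writing two 6-byte events leaves one event in each of two pages, and a 11-byte event is rejected outright.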

Periodically the Page is fsynced. As a filter worker loops around, it acknowledges the batch it was working with, discards it and reads a new batch. Along with the fsync of the Page file, a very small bookkeeping file we call a Checkpoint is atomically (over)written to disk. The Checkpoint holds info about where in the Page the write and acked pointers are - so on a restart Logstash can replay any events not acked.
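The checkpoint idea can be sketched in Python using the common write-to-temp-file-then-rename pattern. This is an illustration only: the JSON layout, the field names `write_pos` and `acked_pos`, and the helper functions are assumptions for the example, not the actual Logstash checkpoint format.

```python
import json
import os
import tempfile

def write_checkpoint(path, write_pos, acked_pos):
    """Atomically (over)write a small checkpoint file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so the rename stays
    # on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"write_pos": write_pos, "acked_pos": acked_pos}, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes reach the disk
    os.replace(tmp, path)     # atomic swap: readers see old or new, never half

def read_checkpoint(path):
    with open(path) as f:
        return json.load(f)

def events_to_replay(events, checkpoint):
    """Events written to the page but not yet acked."""
    return events[checkpoint["acked_pos"]:checkpoint["write_pos"]]
```

On a restart, reading the checkpoint and replaying the slice between the acked and write pointers is what lets unacknowledged events be processed again.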

I hope this answers your questions.
