Are persistent queues needed with Filebeat?


(Fran) #1

Hi all,
I'm testing a simple Elastic Stack deployment to see how Logstash behaves without persistent queues. In this deployment, Logstash only receives data from Filebeat (Apache logs). Both Logstash and Filebeat run fine and seem to be really resilient.

Since Filebeat ensures at-least-once delivery, I'm not sure persistent queues are really needed. Maybe their advantage in this case is absorbing bursts of events.

I've tried to simulate a failure scenario, but I don't know how to force an abnormal termination. Logstash manages to complete the task even if it receives kill -9 or kill -2, or the machine is rebooted.

Thanks very much


(Christian Dahlqvist) #2

Without persistent queues configured, Logstash will use a small in-memory queue, which could lead to data loss if Logstash crashes. You should be able to test this by doing the following:

  1. Take a file and configure the pipeline to write it to a separate index, so you can easily verify the number of events once the file has been completely processed.
  2. Configure Filebeat to read the new file and send it to a Logstash instance without persistent queue configured.
  3. Once you see events being written to the index in Elasticsearch, stop Elasticsearch. This will cause Logstash to queue up events internally and retry sending to Elasticsearch.
  4. Kill Logstash with kill -9
  5. Restart Elasticsearch and Logstash and wait until no more data is processed through the pipeline. Check the number of records that have been successfully indexed and compare this to the number of events in the file.
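
The comparison in step 5 could be scripted along these lines. This is only a minimal sketch: it assumes one event per non-empty line in the file and an Elasticsearch node reachable over plain HTTP without authentication. The helper names (`count_events`, `fetch_indexed_count`, `summarize`) are made up for illustration; the only Elasticsearch feature used is the `_count` API.

```python
import json
import sys
import urllib.request


def count_events(path):
    """Count non-empty lines in the file (assumes one event per line)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def fetch_indexed_count(base_url, index):
    """Query the Elasticsearch _count API for the number of documents in the index."""
    url = f"{base_url.rstrip('/')}/{index}/_count"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)["count"]


def summarize(expected, indexed):
    """Describe the difference between events in the file and indexed documents."""
    missing = expected - indexed
    if missing > 0:
        return f"{missing} of {expected} events missing"
    if missing < 0:
        return f"{-missing} more documents than events (duplicates?)"
    return f"all {expected} events indexed"


if __name__ == "__main__":
    # Hypothetical invocation: python compare.py <log file> <Elasticsearch URL> <index>
    log_path, base_url, index = sys.argv[1:4]
    print(summarize(count_events(log_path), fetch_indexed_count(base_url, index)))
```

Because Filebeat delivers at least once, the index may also end up with more documents than the file has events, so the sketch reports that case separately instead of treating it as success.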

(Fran) #3

Hi Christian,
Great. I've just tested it following these steps, and without persistent queues not all of the data sent by Filebeat reaches Elasticsearch.

I've repeated the same test with persistent queues enabled, and all of the data is sent to Elasticsearch.

Thanks very much

