Logstash stops processing when using persistent queues

ceekay · June 6, 2018, 1:30am

I've updated logstash.yml to use persistent queues for a bit of extra redundancy, and have run into a problem where individual nodes just stop processing messages after a random amount of time (not a consistent 30+ minutes, as I first thought).

Kafka also shows the partition they're reading from as stopped.

The nodes themselves are still doing something, as they are still producing metrics, so I suspect it's related to the Kafka input somehow.

Once a node stops, the remaining nodes pick up the messages from its Kafka partition after a delay of about 5 minutes. However, if left long enough, all the nodes eventually stop processing.

I really have no idea where to start looking as there is nothing of note in logstash-plain.log.

logstash.yml changes are:

queue.type: persisted
queue.page_capacity: 10mb
queue.max_bytes: 10mb
queue.checkpoint.writes: 256

If I disable persistent queues, Logstash goes back to behaving itself. Enable queues again, and I get more stoppages.

Any ideas what could be causing this?

Logstash 6.2.2-1
Kafka 0.11.0.2 (Scala 2.11 build)
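
One place to start looking, given that the nodes still report metrics, is the Logstash monitoring API (port 9600 by default), which exposes per-pipeline queue statistics. A persistent queue that has reached queue.max_bytes applies back-pressure to its inputs, and that could look like a stalled Kafka consumer. The Python sketch below is not from the original post. It assumes the `_node/stats/pipelines` response carries `queue.capacity.queue_size_in_bytes` and `queue.capacity.max_queue_size_in_bytes` for a pipeline named `main`; the function names `queue_fill_ratio` and `fetch_stats` are hypothetical helpers.

```python
import json
from urllib.request import urlopen


def queue_fill_ratio(stats, pipeline="main"):
    """Return queue size divided by the configured maximum, or None if unknown.

    `stats` is the parsed JSON from the node stats API. The field names used
    here are an assumption about the response layout, so verify them against
    the output of your own Logstash version.
    """
    queue = stats.get("pipelines", {}).get(pipeline, {}).get("queue", {})
    capacity = queue.get("capacity", {})
    size = capacity.get("queue_size_in_bytes")
    limit = capacity.get("max_queue_size_in_bytes")
    if size is None or not limit:
        # Memory queues, or an unexpected response shape, land here.
        return None
    return size / limit


def fetch_stats(url="http://localhost:9600/_node/stats/pipelines"):
    """Fetch and parse node stats from a local Logstash instance."""
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

Running `queue_fill_ratio(fetch_stats())` on a stalled node and on a healthy one would show whether the stall coincides with a ratio near 1.0. If it does, the queue size settings are worth a closer look; if it does not, the cause is more likely elsewhere, for example in the Kafka input or the outputs.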

ceekay · June 7, 2018, 10:47pm

Bump: any ideas? This is unusable for me at the moment.

system · July 5, 2018, 10:47pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
