A week ago we built a small Logstash "cluster" consisting of 3 Logstash nodes (virtual machines) with HAProxy instances in front for round-robin load balancing. It worked really well, so we tried out the persistent queue on one of the three nodes. As expected, it worked during the day (no messages were queued, though we could see the queue in Kibana monitoring), so we left it enabled.
The next day we found that the queue was completely full. Something happened at 00:20, and the queue (50 GB) filled up within one hour. It stayed full for 6-7 hours, then the messages were processed.
We couldn't figure out what happened, but we thought that something had sent a lot of messages at that time. So for debugging we disabled the Logstash node in HAProxy, so that no messages would come in on that node.
Weirdly, the same thing still happens. Not on the same scale, but we can still see the queue usage go up at 00:20, even without any messages coming in.
We have absolutely no idea what's happening here. Do you have a hint or an idea?
Our next step would be to add a hardware Logstash node with the persistent queue enabled to that cluster, to see whether the behaviour is the same.
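To pin down exactly when the queue starts growing around 00:20, we could also sample the queue size directly from the node instead of relying on the Kibana graphs. Below is a minimal sketch of that idea, not something we are running yet. It assumes the Logstash monitoring API is reachable on its default address (`localhost:9600`) and that the pipeline stats expose `queue_size_in_bytes` and `max_queue_size_in_bytes`, either directly under `queue` or nested under `queue.capacity` depending on the Logstash version. The function names `queue_usage`, `fetch_stats` and `watch` are made up for this example.

```python
import json
import time
import urllib.request

# Assumption: the Logstash monitoring API listens on its default host and port.
STATS_URL = "http://localhost:9600/_node/stats/pipelines"


def queue_usage(stats):
    """Return {pipeline_name: (used_bytes, max_bytes)} for persisted queues only."""
    usage = {}
    for name, pipeline in stats.get("pipelines", {}).items():
        queue = pipeline.get("queue") or {}
        if queue.get("type") != "persisted":
            continue  # in-memory queues report no on-disk size
        # Assumption: sizes sit under "capacity" in some versions and
        # directly under "queue" in others, so fall back to the queue itself.
        sizes = queue.get("capacity", queue)
        usage[name] = (
            sizes.get("queue_size_in_bytes"),
            sizes.get("max_queue_size_in_bytes"),
        )
    return usage


def fetch_stats(url=STATS_URL):
    """Fetch and parse the pipeline stats JSON from the monitoring API."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def watch(interval_s=60):
    """Print one timestamped line per persisted queue every interval_s seconds."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for name, (used, limit) in queue_usage(fetch_stats()).items():
            print(f"{stamp} pipeline={name} used_bytes={used} max_bytes={limit}")
        time.sleep(interval_s)
```

Calling `watch()` on the node that is disabled in HAProxy and redirecting the output to a file overnight would give a timestamped series that can be lined up with cron jobs, backups or anything else scheduled around 00:20.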