Hi fellow log analysts.
We are currently using Logstash for all syslog-related log data.
Our issue is that Logstash (almost) stops processing log data after some time, which results in backpressure towards the delivering system (an rsyslog server). Elasticsearch itself is fine, and immediately after a restart of Logstash processing is back to normal (throughput is good, the backpressure slowly resolves).
The problem seems to have started around the time we began grokking Firepower log data. Unfortunately the Logstash logs have not helped us pinpoint the issue: there are some error and timeout messages from filters, but we have not been successful in identifying the problematic filter (or the combination of input data and filter).
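To illustrate what we mean, here is a minimal sketch of how we are thinking about instrumenting the grok stage so that timeouts and failures point back at the offending pattern/input. The pattern shown is a placeholder, not our production Firepower pattern:

```
filter {
  grok {
    # Placeholder pattern; our real Firepower patterns are more involved.
    match => { "message" => [ "%{CISCOTIMESTAMP:ts} %{GREEDYDATA:fp_msg}" ] }
    # Fail fast instead of letting a pathological match occupy a worker for long.
    timeout_millis => 3000
    # Tag events so the inputs that trigger timeouts/failures can be found later.
    tag_on_timeout => "_groktimeout_firepower"
    tag_on_failure => [ "_grokparsefailure_firepower" ]
  }
}
```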
In the error state Logstash itself stalls at a (somewhat) elevated load level. All curl queries against the monitoring API are still answered successfully, but they take seconds to respond (in contrast to normal operation, where they are answered within milliseconds).
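For reference, the monitoring calls we issue are along these lines (port 9600 is the default API port; the hot_threads snapshot is something we could additionally capture while a stall is happening):

```
# Pipeline statistics (normally answered in milliseconds, takes seconds in the stalled state)
curl -s 'localhost:9600/_node/stats/pipelines?pretty'

# Hot threads snapshot, to see what the workers are busy with during a stall
curl -s 'localhost:9600/_node/hot_threads?human=true&threads=10'
```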
The Logstash system is a 12-core VM with 16 GB of RAM. In normal operation the load average stays well below 1; in the error state it hovers around 2.
We currently do not know how to resolve the situation (apart from detecting the error state and restarting Logstash).
Are there any known issues where Logstash does not survive filter problems and stalls input processing?