We use this stack to index and visualize logs coming off a firewall.
The setup is like this -
Firewall --(Syslog)-->Syslog Server--(file input)-->Filebeat-->Redis-->Logstash-->Elasticsearch cluster
Recently we had a 5x spike in logs from the firewall. This led to the Redis queue being overwhelmed and Redis becoming unresponsive.
We have managed to trace the cause of the spike and address the issue. However, it got me thinking about whether there are any protections we could put in place to prevent such an event from overwhelming the setup again.
This is where I am hoping to hear from the members here. Is there any way we can:
a) set a limit on the number of events Filebeat pushes into Redis per second?
b) do this without dropping the events above the limit (since the events are being read off a file and not a stream)?
If for some reason Filebeat cannot output events, it will back off on the input; since it is reading from a file, it will simply stop reading until it can output events again.
It has an internal queue, which is in memory by default, and when this queue is full it does not accept new events until events in the queue start being sent to the output again.
When reading from files this normally does not lead to data loss, but in some specific cases it can happen, for example when the file rotates and the old file is deleted before Filebeat has finished reading it.
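If it helps, here is a minimal filebeat.yml sketch of the settings that control this backpressure behaviour. The values and the Redis host are placeholders, not recommendations, assuming a reasonably recent Filebeat:

```yaml
# Internal memory queue: when it fills up, Filebeat stops reading the input
# instead of dropping events (sizes below are illustrative only).
queue.mem:
  events: 4096            # max events held in memory before backpressure kicks in
  flush.min_events: 512   # batch size the queue hands to the output
  flush.timeout: 5s       # flush even a partial batch after this long

output.redis:
  hosts: ["redis.example.local:6379"]   # placeholder host
  key: "filebeat"
  bulk_max_size: 2048     # cap on events sent to Redis in a single request
  backoff.init: 1s        # wait this long after the first failed publish
  backoff.max: 60s        # upper bound for the exponential backoff
```

There is no hard events-per-second throttle here; what you get is backpressure. If Redis cannot keep up, the queue fills and the file simply gets read more slowly, so smaller queue and batch sizes effectively slow the read rate without dropping events.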
Are you required to use Redis? I have a similar setup, but using Kafka.
I had some issues with Redis in the past and decided to switch to Kafka; it adds a little more management overhead, but in my case it performs far better.
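For reference, switching the output in filebeat.yml is mostly a matter of replacing the Redis block with a Kafka one. The broker addresses and topic name below are placeholders:

```yaml
output.kafka:
  hosts: ["kafka1.example.local:9092", "kafka2.example.local:9092"]  # placeholder brokers
  topic: "firewall-logs"       # placeholder topic
  required_acks: 1             # wait for the partition leader to acknowledge each batch
  compression: gzip            # compress batches on the wire
  max_message_bytes: 1000000   # single events larger than this are dropped
```

A side benefit for spikes like yours: Kafka buffers on disk on the broker side, so a burst gets absorbed by the topic rather than by the memory of whatever sits in the middle.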