I've purchased the Logstash Book and am trying to configure a test farm. The book recommends using Redis between the shipper and Logstash. I am now configuring Filebeat and read here that "Redis output for the Beats is deprecated, because it is compatible with the Redis input plugin".
I do not understand this: an output plugin is deprecated because it is compatible with an input plugin.
And, is it still recommended to use Redis in between a shipper (filebeat in this case) and logstash?
Or is it recommended from now on not to use Redis?
Unfortunately, as you pointed out, the documentation here is not very clear. What it means is that instead of having Beats send data directly to Redis, you can use LS to send it to Redis. So your setup would look like FB -> LS -> Redis -> LS -> ES. In general, we no longer recommend using Redis; we recommend sending data directly to LS instead. What is the reason you need Redis in your use case?
How do you handle fault tolerance when there is no Redis in between?
As far as I know, Logstash only has a queue size of 20...
For example: we currently handle ~1 TB of log data per day, and at peak times Redis buffers the load quite well until Logstash and the Elasticsearch cluster have processed the data. For us, removing the Redis boxes means scaling up the Logstash boxes/instances and also the Elasticsearch cluster, only for the peak times. For the rest of the day the additional hardware is not needed. To me this does not sound cost effective...
Filebeat has at-least-once delivery semantics. It will not drop any log lines if Logstash or Elasticsearch becomes unresponsive, or if lines are written faster than they can be processed at peak times. Back pressure from Logstash/Elasticsearch is applied directly to Filebeat. As long as Logstash/Elasticsearch can absorb a full day's logs within a day, and the log files remain available until they have been indexed, you should be fine.
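To put rough numbers on "absorb a full day's logs within a day", here is a back-of-the-envelope sketch in Python. Only the ~1 TB per day figure comes from this thread (a decimal terabyte is assumed); the peak rate, off-peak rate, and sink rate are invented for illustration, and the helper names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAILY_BYTES = 1e12  # ~1 TB/day from the thread; decimal terabyte assumed

# Minimum sustained rate the Logstash/Elasticsearch side must average.
average_rate = DAILY_BYTES / SECONDS_PER_DAY  # bytes per second


def backlog_bytes(peak_rate, sink_rate, peak_seconds):
    """Bytes still unread in the log files when a peak ends."""
    return max(0.0, (peak_rate - sink_rate) * peak_seconds)


def drain_seconds(backlog, sink_rate, offpeak_rate):
    """Time needed to catch up once input drops to offpeak_rate."""
    spare = sink_rate - offpeak_rate
    if spare <= 0:
        raise ValueError("sink never catches up at this off-peak rate")
    return backlog / spare


# Invented scenario: 30 MB/s for a 2 hour peak, 10 MB/s otherwise,
# and a sink that can index 15 MB/s.
peak_backlog = backlog_bytes(peak_rate=30e6, sink_rate=15e6, peak_seconds=2 * 3600)
catch_up = drain_seconds(peak_backlog, sink_rate=15e6, offpeak_rate=10e6)

print(f"average rate needed: {average_rate / 1e6:.2f} MB/s")
print(f"backlog after peak:  {peak_backlog / 1e9:.0f} GB")
print(f"time to catch up:    {catch_up / 3600:.1f} h")
```

In this made-up scenario the log files themselves act as the buffer: they need to stay on disk (not be rotated away) for at least the catch-up time after the peak.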
Logstash returns an ACK once it has received the log lines. In Filebeat, only lines that have been ACKed will (most likely) not be resent to Logstash. Lines that were not ACKed will be sent again.
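The ACK and resend behaviour can be illustrated with a small simulation. This is only a sketch of the general at-least-once pattern, not Filebeat's or Logstash's actual code or protocol; `FlakyReceiver` and `ship` are hypothetical names, and the batch size and retry limit are arbitrary.

```python
class FlakyReceiver:
    """Hypothetical stand-in for the receiving side (not a real API).

    It stores every batch it gets, but the ACK is "lost" for the call
    numbers listed in drop_ack_on, as if the connection broke after
    the data had already arrived.
    """

    def __init__(self, drop_ack_on):
        self.drop_ack_on = set(drop_ack_on)
        self.calls = 0
        self.received = []

    def send(self, batch):
        self.calls += 1
        self.received.extend(batch)
        return self.calls not in self.drop_ack_on  # True means ACK arrived


def ship(lines, receiver, batch_size=2, max_attempts=5):
    """Send lines in batches; advance the read position only after an ACK."""
    position = 0  # plays the role of a persisted read offset
    while position < len(lines):
        batch = lines[position:position + batch_size]
        for _ in range(max_attempts):
            if receiver.send(batch):
                break  # ACKed: safe to move on
        else:
            raise RuntimeError("receiver never ACKed the batch")
        position += len(batch)
    return position


receiver = FlakyReceiver(drop_ack_on={2})  # the second ACK is lost
acked = ship(["a", "b", "c", "d", "e"], receiver)
print(acked, receiver.received)
```

Nothing is lost, but the batch whose ACK went missing arrives twice. That is the trade-off of at-least-once delivery: duplicates are possible, gaps are not.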
The only reason is that the Logstash Book recommends using it (between shipper and logstash).
OK, as I understand it now, this recommendation holds unless the shipper is Filebeat.