Previously, we have been running 6 different Logstash processes on each Logstash server (5 shippers and 1 indexer) plus 1 Redis process. The shippers each had their own configuration file, listening for incoming data on specific ports, and their output sections were configured to send to one of two Redis servers (for load balancing and failover protection). The indexer process then had an input of Redis and an output of Elasticsearch.
With ES 2.x, we decided to standardize on a single logstash.conf file, and therefore a single Logstash process plus 1 Redis process, which should make things easier to manage and monitor. But I can't find any examples of using Redis in the middle like this.
But how do I specify multiple options for the output section? Specifically, what values can I key off of so that the shippers (syslog, snmptraps, beats, logs, etc.) all send their data to Redis first, and only the events read back from Redis get sent to Elasticsearch?
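For what it's worth, Logstash conditionals in the output section can key off fields or tags. A minimal sketch of a single-file setup, assuming you add a marker tag on the Redis input (the hostnames, key name, and `from_redis` tag are placeholders, not defaults):

```
input {
  redis {
    host      => "127.0.0.1"
    data_type => "list"
    key       => "logstash"
    tags      => ["from_redis"]   # assumed marker so the output can tell broker traffic apart
  }
  # ... syslog, snmptrap, beats inputs here ...
}

output {
  if "from_redis" in [tags] {
    elasticsearch { hosts => ["localhost:9200"] }
  } else {
    redis {
      host      => ["redis1.example.com", "redis2.example.com"]  # assumed hosts
      data_type => "list"
      key       => "logstash"
    }
  }
}
```

A tag is used rather than `type` because `type` set on an input is only applied when the event doesn't already have one, which shipped events often do. The tag will persist into the indexed documents unless removed with a mutate filter.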
So, just to confirm: the standard process, if you are using Redis, would be a shipper config that has the inputs for all incoming streams, appropriate filters, and an output to Redis, along with a separate indexer config that has an input of Redis and an output of Elasticsearch.
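As a sketch, that conventional split looks roughly like this (file names, ports, hosts, and key names are all placeholders):

```
# shipper.conf (assumed name): receive events and hand them to the broker
input {
  syslog { port => 5514 }
  beats  { port => 5044 }
}
output {
  redis {
    host      => ["redis1.example.com", "redis2.example.com"]  # picks one, fails over
    data_type => "list"
    key       => "logstash"
  }
}

# indexer.conf (assumed name): drain the broker and index
input {
  redis {
    host      => "redis1.example.com"
    data_type => "list"
    key       => "logstash"
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}
```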
So, the shipper config shouldn't have any filters and the data should just be passed to Redis. Then the indexer config takes care of the filters and sending the data to Elasticsearch?
Wouldn't that put events that should be dropped through Redis unnecessarily? What are the benefits of putting the filters in the indexer config?
You get the events into your broker layer ASAP. If you have filters on the incoming part you may end up slowing things down upstream. Whereas putting them after means events just queue if your existing instances are busy, and it allows you to easily scale out (it's harder to scale inbound).
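In that layout the heavy work lives in the indexer config. An illustrative fragment, assuming upstream parsing produces a `severity` field (both the field and the pattern are assumptions for the example):

```
# indexer.conf fragment (illustrative): filters run after Redis, so slow
# grok/drop work only backs up the queue rather than the senders.
filter {
  grok {
    match => { "message" => "%{SYSLOGLINE}" }
  }
  if [severity] == "debug" {   # assumed field name
    drop { }
  }
}
```

Dropped events do transit Redis unnecessarily, but that cost is usually small compared to keeping the inbound path fast.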