Hi folks, I'm trying to understand what the best setup is to guarantee zero lost messages.
I know Logstash will not purposely lose messages, but I have seen references to people pulling in Redis to ensure they don't lose messages, which seems like an odd requirement. I thought that in this configuration Redis is just acting as a bigger 'insert' buffer in front of Elasticsearch, since Elasticsearch can get slow when dealing with lots of inserts (I'm assuming Elasticsearch itself will not drop inserts when it's under heavy load and instead just gets slower).
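For reference, the kind of setup I mean is a shipper pushing events onto a Redis list and a separate indexer popping them off, roughly like this (a minimal untested sketch; the hostnames and the `key` are placeholders):

```
# shipper.conf -- reads logs and pushes events onto a Redis list
input {
  file {
    path => "/var/log/app/*.log"     # placeholder path
  }
}
output {
  redis {
    host      => "redis.example.com" # placeholder host
    data_type => "list"
    key       => "logstash"          # name of the Redis list acting as the buffer
  }
}

# indexer.conf -- pops events off the Redis list and indexes them
input {
  redis {
    host      => "redis.example.com"
    data_type => "list"
    key       => "logstash"
  }
}
output {
  elasticsearch {
    host => "es.example.com"         # placeholder host
  }
}
```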
I read that Logstash uses a 20-message buffer, broken into 3 phases (equalling a total of 60 messages in the pipeline).
Question: if Logstash crashes with a full pipeline, will it pick those messages up again when it comes back online, or does it treat them as already processed?
If you have specific questions about Elasticsearch, we can address them in the Elasticsearch category on Discuss.
@warkolm is correct. We are working on making LS resilient to crashes with respect to message loss. Today, LS does not offer any delivery guarantees such as at-least-once or at-most-once. The software's core philosophy is to not lose messages intentionally, so we try hard to fix bugs which result in crashes. For this reason, the internal buffer between the different stages is capped at 20 events, so at most there are 40 in-flight events (in memory) in LS that can be lost in a hard crash. Using a message broker between the LS stages (shippers and indexers) can mitigate this message loss. For example, Apache Kafka provides a way to replay messages which were not committed to ZooKeeper (via the LS Kafka input). LS 1.5 natively supports Kafka, so this is an option.
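To illustrate, a Kafka-fronted indexer could look roughly like this (a minimal sketch against the LS 1.5-era kafka input; the ZooKeeper address, topic, and group are placeholders, and option names may differ in other versions):

```
# indexer.conf -- consumes from a Kafka topic and indexes into Elasticsearch
input {
  kafka {
    zk_connect => "zookeeper.example.com:2181" # placeholder ZooKeeper quorum
    topic_id   => "logstash"                   # placeholder topic name
    group_id   => "logstash-indexer"           # consumer group whose offsets are tracked in ZooKeeper
  }
}
output {
  elasticsearch {
    host => "es.example.com"                   # placeholder host
  }
}
```

Because offsets are committed per consumer group, messages that had not been committed to ZooKeeper at the time of a crash are replayed when the indexer comes back up.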
For 2.0, we are working on persisting these in-flight messages to disk so they can be recovered after a hard crash.
Correct, but there are only 2 queues: input -> filter and filter -> output, each of capacity 20.
Not necessarily. Message queues can introduce latency when you are indexing events, but not by a big margin. Also, if message loss is a concern, you have to accept some latency.
One follow-up question: in some of our previous usage (we are using the multiline plugin), we found some very large data rows in Elasticsearch. Is there a way to put a hard cap on message length (see the sketch after this list) to ensure that:
- Logstash does not attempt to process a massive message that 'cannot fit into memory', and
- Logstash does not attempt to insert a 'corrupted' (i.e. overly long) message into Elasticsearch.
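Something along these lines is what I had in mind, assuming the multiline codec exposes size limits like `max_lines` / `max_bytes` (I'm not sure which Logstash versions have these, so treat this as a sketch; the path and pattern are placeholders for our setup):

```
input {
  file {
    path  => "/var/log/app/*.log"
    codec => multiline {
      pattern   => "^\s"       # example: lines starting with whitespace belong to the previous event
      what      => "previous"
      max_lines => 500         # cap the number of lines merged into one event
      max_bytes => "10 MiB"    # cap the total size of a merged event
    }
  }
}
```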