I think it's great that Logstash is now offering that feature, but there's a part of the description that is a bit concerning to me:
An event is recorded as ACKed in the checkpoint file if the event is successfully sent to the last output stage in the pipeline; Logstash does not wait for the output to acknowledge delivery.
So if Logstash is not waiting for the output to acknowledge delivery, what happens if, for example, the elasticsearch output of Logstash is not able to deliver the event to Elasticsearch due to a network error?
The way I read it, and from what I remember of the design/code review of this feature, is that the output plugin is handed the events, and once that happens, we acknowledge.
In the case of the Elasticsearch output hitting a network error, the Elasticsearch output will retry most kinds of errors. This retry loop blocks until it succeeds, which means the pipeline does not get the opportunity to ack events in the queue until the output plugin's receive or multi_receive call has returned.
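To make that concrete, here is a minimal Python sketch of a single pipeline worker. The queue and batch objects (read_batch, ack) and the retry loop are hypothetical stand-ins, not Logstash's actual internals; only the multi_receive name mirrors the real plugin API:

```python
# Hypothetical sketch -- not Logstash source code.
import time


class RetryingOutput:
    """Stand-in for an output (e.g. elasticsearch) that retries on failure."""

    def __init__(self, send):
        self.send = send  # callable that delivers a batch downstream

    def multi_receive(self, events):
        while True:
            try:
                self.send(events)  # e.g. a bulk request to Elasticsearch
                return             # returns only once delivery succeeded
            except ConnectionError:
                time.sleep(1)      # blocking retry; the worker is stuck here


def worker_loop(queue, apply_filters, outputs):
    while True:
        batch = queue.read_batch()        # events come off the persistent queue
        events = apply_filters(batch.events)
        for output in outputs:
            output.multi_receive(events)  # blocks until the output has them
        batch.ack()                       # recorded in the checkpoint file only
                                          # after every output has returned
```

The key point is the ordering: batch.ack() is unreachable while any output is still inside multi_receive, so a retrying output delays the ack rather than losing the event.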
In summary, assuming my recollection of the design is accurate: an event will not be acked in the persistent queue until all outputs have finished receiving it, so by default most outputs will have delivered their events downstream (to Elasticsearch, etc.) before the events are acked in the queue.
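For completeness, all of this assumes the persistent queue is enabled in logstash.yml. The setting names below are the real ones; the values are just illustrative:

```yaml
# logstash.yml
queue.type: persisted         # default is "memory" (no durable queue)
queue.max_bytes: 1gb          # disk cap before the queue applies back-pressure
queue.checkpoint.acks: 1024   # write a checkpoint after this many acked events
```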