I am using Logstash running in Kubernetes to ingest data from Kafka and write it to Elasticsearch.
If a Logstash instance terminates abnormally while processing data, events can be lost. It appears there is no end-to-end acknowledgement of processing available ([Meta] End to End ACKs / Queueless Mode · Issue #8514 · elastic/logstash · GitHub).
What are the best settings to minimise data loss?
Currently I have not set enable_auto_commit, so it is defaulting to true.
I see the documentation (Kafka input plugin | Logstash Reference [8.12] | Elastic) says:
Default value is true
If true, periodically commit to Kafka the offsets of messages already returned by the consumer. If value is false however, the offset is committed every time the consumer writes data fetched from the topic to the in-memory or persistent queue. This committed offset will be used when the process fails as the position from which the consumption will begin.
It sounds like setting this to false would only address the case where Logstash reads from the topic and the offset is committed before the event has been written to the in-memory queue. Is that correct?
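For reference, this is roughly the change to the Kafka input that I am considering; the broker address, topic, and group ID below are placeholders rather than my actual values:

```
input {
  kafka {
    bootstrap_servers => "kafka:9092"   # placeholder
    topics            => ["my-topic"]   # placeholder
    group_id          => "logstash"     # placeholder
    # Commit offsets only once the consumer has written the fetched events to
    # Logstash's queue, instead of on the periodic auto-commit timer.
    enable_auto_commit => false
  }
}
```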
What is the best I can do to avoid data loss within Logstash on Kubernetes? Is it:
- Set enable_auto_commit to FALSE, and
- Enable persistent queues
Or something else?
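To make the second bullet concrete, this is the kind of logstash.yml I have in mind for the persistent queue; the path and sizes are placeholders, and the queue directory would need to sit on a PersistentVolumeClaim so it survives pod restarts:

```yaml
# logstash.yml (sketch; values are placeholders)
queue.type: persisted                       # disk-backed queue instead of the default in-memory queue
path.queue: /usr/share/logstash/data/queue  # must be on a volume that survives pod restarts (PVC)
queue.max_bytes: 1gb                        # cap on the on-disk queue size
queue.checkpoint.writes: 1                  # checkpoint after every written event (maximum durability, lower throughput)
queue.drain: true                           # wait for the queue to drain before shutting down
```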