Currently we have the following setup:
Filebeat reads log files and sends the content to Kafka. One log-line results in one Kafka event.
On the other side, Logstash reads the events from Kafka, parses them, and sends the resulting documents to Elasticsearch.
Filebeat runs inside a Docker container and reads the log files from a Docker data container.
We are using Filebeat 5.1.2 with Docker 1.11.2.
The problem we are encountering is as follows:
Although the offset in the Filebeat registry points to the start of a new log-line, after a restart Filebeat starts reading somewhere in the middle of the previous log-line. As a result, a partial log-line is sent to Kafka, which causes parsing errors in Logstash.
According to the documentation section "How does Filebeat ensure At-Least-Once delivery?", Filebeat should simply resume reading from the registry offset after a restart.
What could cause the behavior we are experiencing?
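
For anyone who wants to reproduce the check that the registry offset really points to the start of a log-line, here is a minimal Python sketch. It assumes the registry file is a JSON array of entries that each carry a "source" path and a byte "offset"; that layout is an assumption about the 5.x registry and should be verified against the actual file. The function names offset_is_line_start and check_registry are hypothetical helpers, not part of Filebeat.

```python
import json
import os
import tempfile


def offset_is_line_start(log_path, offset):
    """Return True if offset is 0 or the byte just before it is a newline."""
    if offset == 0:
        return True
    with open(log_path, "rb") as f:
        f.seek(offset - 1)
        # An offset beyond the end of the file yields b"" and therefore False.
        return f.read(1) == b"\n"


def check_registry(registry_path):
    """Return (source, offset, on_line_boundary) for every registry entry.

    Assumption: the registry is a JSON array of objects with "source"
    and "offset" keys. Entries whose source file is missing are skipped.
    """
    with open(registry_path) as f:
        states = json.load(f)
    results = []
    for state in states:
        source, offset = state["source"], state["offset"]
        if not os.path.exists(source):
            continue
        results.append((source, offset, offset_is_line_start(source, offset)))
    return results
```

Running check_registry against a copy of the registry taken while Filebeat is stopped would show whether any stored offset falls inside a line. If every entry reports a line boundary, the stored offset itself is fine and the partial read must originate elsewhere.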