The docs say that Filebeat keeps the state of each file it reads in the registry file.
So it should not send the same logs again and again if the log file has not changed.
I have added a cron job that restarts Filebeat every 5 minutes, and after each restart the same log data is sent to Elasticsearch again. It should not resend the same data, am I correct?
filebeat.registry_file: ${path.data}/registry
This line is in filebeat.full.yml; do I have to add it to filebeat.yml as well?
Filebeat keeps the most recent acknowledged state of files in the registry file. The ACK comes from Logstash/Elasticsearch. Because Filebeat buffers lines into batches, the number of lines read can be greater than the number of lines ACKed. If the output does not ACK an event, the event must be sent again (at-least-once semantics).
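To illustrate why a restart can cause duplicates, here is a toy model in Python. It is not Filebeat's code: the class `ToyShipper`, the function `run`, and the batch size are all made up for illustration. The only idea it borrows from the explanation above is that the persisted offset advances on ACK, not on read.

```python
# Toy model only: NOT Filebeat's implementation. All names are hypothetical.
class ToyShipper:
    def __init__(self, lines, registry_offset=0, batch_size=3):
        self.lines = lines
        self.registry_offset = registry_offset  # persisted: last ACKed position
        self.read_offset = registry_offset      # in memory: how far we have read
        self.batch_size = batch_size

    def read_batch(self):
        # Reading moves the in-memory offset, but not the registry.
        batch = self.lines[self.read_offset:self.read_offset + self.batch_size]
        self.read_offset += len(batch)
        return batch

    def ack(self, count):
        # Only an ACK from the output advances the persisted offset.
        self.registry_offset += count


def run(lines, registry_offset, ack_ok):
    """Start a shipper, send one batch, return (batch, registry offset)."""
    shipper = ToyShipper(lines, registry_offset)
    batch = shipper.read_batch()
    if ack_ok:
        shipper.ack(len(batch))
    return batch, shipper.registry_offset


log = ["a", "b", "c", "d"]
sent1, reg = run(log, 0, ack_ok=False)   # restarted before the ACK arrived
sent2, reg = run(log, reg, ack_ok=True)  # the same batch goes out again
sent3, reg = run(log, reg, ack_ok=True)  # only now does the shipper move on
```

In this sketch the first two runs send the same events, because the registry offset was never advanced before the restart. If the restarts in your setup interrupt Filebeat while it is waiting for an ACK, something similar could explain the duplicates.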
Have you checked the Filebeat logs? Was the connection closed while waiting for an ACK?
Can you point me to the place in the docs where it says "read"? It might be a doc bug.