Moving from Logstash to Filebeat => no duplicate logs

Hi everyone,

I would like to move from Logstash to Filebeat as a log shipper.

I'm using the Logstash file input plugin to collect logs (the same thing Filebeat would do) and send everything to a centralized Logstash shipper before writing to Elasticsearch.

The thing is, if I shut down Logstash and start a fresh Filebeat instance instead, Filebeat starts reading from the beginning of each file, leading to duplicate logs in Elasticsearch.

I could compute a hash of the log content on the Logstash shipper side and use it as the Elasticsearch document_id to avoid duplicates, but I have to admit I'd like an easier solution.
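To make clear what I mean, here is a rough Python sketch of that idea; in a real pipeline the hash would be computed inside Logstash (e.g. with the fingerprint filter), so this is just for illustration:

```python
import hashlib

def content_hash_id(line: str) -> str:
    """Deterministic document_id derived from the raw log line, so
    re-ingesting the same line overwrites instead of duplicating."""
    return hashlib.sha1(line.encode("utf-8")).hexdigest()

# Same line -> same _id, so Elasticsearch upserts instead of
# indexing a second copy of the event.
print(content_hash_id('127.0.0.1 - - [01/Jan/2017] "GET / HTTP/1.1" 200'))
```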

Would you have any idea how to "bootstrap" a Filebeat instance with the Logstash file cursor, maybe?

Thank you

I never tried it, but I think it should be possible to write a small script in your preferred language that takes the sincedb from LS and converts it into a Filebeat registry file; see the sketch below. An alternative is using tail_files in Filebeat, but if log lines were added between shutting down LS and starting up Filebeat, these are lost.
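Untested sketch of such a converter. It assumes a sincedb line format of `inode major_dev minor_dev offset expires path` (newer versions of the file input write the path; older ones only store the first four fields, in which case you have to map inodes to paths yourself) and the pre-7.x Filebeat registry layout (a JSON array of file states). Check both formats against your exact versions before relying on it:

```python
import json
import os
import sys
from datetime import datetime, timezone

def convert(sincedb_path: str, registry_path: str) -> None:
    """Convert a Logstash sincedb into a Filebeat registry file (assumed
    pre-7.x JSON-array format)."""
    states = []
    with open(sincedb_path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 6:
                continue  # no path recorded for this inode; skip it
            inode, major, minor, offset = fields[:4]
            path = fields[5]
            states.append({
                "source": path,
                "offset": int(offset),
                "FileStateOS": {
                    "inode": int(inode),
                    # sincedb stores major/minor separately; the registry
                    # wants one device number (assumption: makedev matches)
                    "device": os.makedev(int(major), int(minor)),
                },
                # timestamp format may need adjusting to what your
                # Filebeat version writes
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "ttl": -1,  # never expire this state
            })
    with open(registry_path, "w") as f:
        json.dump(states, f)

if __name__ == "__main__":
    convert(sys.argv[1], sys.argv[2])
```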

Another solution could be to write the Filebeat logs to a different index, manually check (based on the timestamp?) what the time range of the duplicated events is, and then use delete_by_query to remove these from one of the two indices; see the sketch below. That would mean running both for a certain period. This is also what I would recommend.
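Roughly like this, assuming Elasticsearch 5.x+ where the _delete_by_query API is built in (it was a plugin before that); the index name and time range are placeholders for your overlap window:

```python
import requests

# Delete events in the overlap window from one of the two indices.
resp = requests.post(
    "http://localhost:9200/filebeat-2017.01.01/_delete_by_query",
    json={
        "query": {
            "range": {
                "@timestamp": {
                    "gte": "2017-01-01T10:00:00Z",
                    "lte": "2017-01-01T10:05:00Z",
                }
            }
        }
    },
)
resp.raise_for_status()
print(resp.json()["deleted"], "duplicate events removed")
```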

Hi ruflin,

Thanks for such a quick answer!

Maybe losing some logs with tail_files will be acceptable for my client.
Converting the sincedb into a Filebeat registry file seems OK to me; I'll look into it.

Thank you
