Duplication in Filebeat to Elasticsearch data pushing

Hi Team,

I am using a setup where Filebeat reads JSON logs and pushes the data to Elasticsearch directly, without Logstash. When I check the index, I can see duplicate documents.

Could you please let us know how we can configure a unique ID (similar to the document ID in Logstash, `document_id => "%{eventid}-%{time}"`) to avoid duplication in this setup?

Setting custom or unique document IDs is not yet supported. Deduplication via IDs is definitely something we want to support in the future.

Duplicate events are normally a sign of I/O errors while ingesting data: a bulk request fails partway through and is retried, and without stable document IDs the retried events get indexed a second time. Have you checked your logs for I/O errors? It may also be a matter of tuning the queue/bulk size used when sending events to Elasticsearch.
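
If retried bulk requests are the cause, one knob worth experimenting with is the output's bulk size. A minimal sketch, assuming a plain Elasticsearch output; the values are illustrative, not recommendations, and defaults vary by Filebeat version:

```yaml
# filebeat.yml (sketch, values illustrative)
output.elasticsearch:
  hosts: ["localhost:9200"]
  # Smaller bulks shrink the blast radius of a failed/retried request,
  # which is the usual source of duplicate documents when events have no IDs.
  bulk_max_size: 512

queue.mem:
  events: 4096   # in-memory queue size; keep it comfortably above bulk_max_size
```

This does not eliminate duplicates, it only reduces how many events a single failed bulk request can re-send.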

Could you please provide us an ETA on that?

No ETA. Some related work is done in the PRs #5811 and #5844, but more work is still required to support setting document IDs in Beats.
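
For readers landing on this thread later: once that work shipped, newer Filebeat releases can derive a document ID from event fields via processors, so retries overwrite instead of duplicate. A sketch, assuming a Filebeat version that includes the `fingerprint` processor; the field names `eventid` and `time` are taken from the question and must match your actual JSON fields:

```yaml
# filebeat.yml (sketch for later Filebeat versions)
processors:
  # Hash the fields that uniquely identify an event and use the result
  # as the Elasticsearch document _id via the @metadata._id convention.
  - fingerprint:
      fields: ["eventid", "time"]   # assumed field names from the question
      target_field: "@metadata._id"
```

Check the Filebeat documentation for your version before relying on this; the thread above predates the feature.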


@steffens OK, cool. Waiting for that release.

This topic was automatically closed after 21 days. New replies are no longer allowed.