I am using a setup in which Filebeat reads JSON logs and pushes the data directly to Elasticsearch, without Logstash. While checking the indexed data, I can see duplicate documents.
Could you please let us know how to configure a unique ID in this setup (similar to the document ID in Logstash, `document_id => "%{eventid}-%{time}"`) to avoid duplication?
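For context, the `document_id` setting quoted above lives in the Logstash `elasticsearch` output. A minimal sketch (the host and index name here are placeholders, and `eventid`/`time` are assumed to be fields present on each event):

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]     # placeholder host
    index => "filebeat-logs"        # hypothetical index name
    # Deterministic ID: re-ingesting the same event overwrites the
    # existing document instead of creating a duplicate.
    document_id => "%{eventid}-%{time}"
  }
}
```

Because Elasticsearch treats an index request with an existing `_id` as an update, replaying the same event is idempotent rather than duplicating.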
Setting IDs or unique IDs is not yet supported in Filebeat. Deduplication via IDs is something we definitely want to support in the future.
Duplicate events are normally a sign of I/O errors during ingestion. Have you checked your logs for I/O errors? It may also be a matter of tuning the queue/bulk sizes used when indexing events into Elasticsearch.