I am using a setup in which Filebeat reads JSON logs and pushes the data directly to Elasticsearch, without Logstash. While checking the indexed data, I can see duplicate documents.
Could you please let us know how to configure a unique ID in this setup (similar to the document ID in Logstash, `document_id => "%{eventid}-%{time}"`) to avoid duplication?
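For context, the `document_id` setting quoted above lives in the Logstash `elasticsearch` output. A minimal sketch (the host and index name here are placeholders, and `eventid`/`time` are assumed to be fields present on each event):

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]     # placeholder host
    index => "filebeat-logs"        # hypothetical index name
    # Deterministic ID: re-ingesting the same event overwrites the
    # existing document instead of creating a duplicate.
    document_id => "%{eventid}-%{time}"
  }
}
```

Because Elasticsearch treats an index request with an existing `_id` as an update, replaying the same event is idempotent rather than duplicating.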
Setting IDs or unique IDs is not yet supported in Filebeat. Deduplication via IDs is something we definitely want to support in the future.
Duplicate events are normally a sign of I/O errors during ingestion. Have you checked your logs for I/O errors? It may also be a matter of tuning the queue/bulk sizes used when indexing events into Elasticsearch.