I'm a beginner in this area and I want to send duplicate lines of a log file to Elasticsearch. Currently my setup identifies duplicates and sends only one of them.
By default, elasticsearch will generate a unique document_id for each event. You are forcing it to use a document_id based on the value of the message field. If two events contain the same value of [message] they will get the same document_id.
If you want elasticsearch to include duplicate messages, remove the fingerprint filter and do not set the document_id option.
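For reference, the kind of pipeline being described presumably looks something like the sketch below; the hosts, index name, and fingerprint target here are assumptions, not taken from your configuration:

```
filter {
  fingerprint {
    source => "message"                    # hash the raw log line
    method => "SHA256"
    target => "[@metadata][fingerprint]"   # keep the hash out of the stored document
  }
}
output {
  elasticsearch {
    hosts       => ["localhost:9200"]      # assumed host
    index       => "mylogs"                # assumed index name
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```

With this in place, two events with the same [message] hash to the same fingerprint, so the second one overwrites the first instead of creating a new document. Remove the filter and the document_id option and elasticsearch assigns each event its own unique id, so duplicates are kept.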
Thank you for the reply. When I remove the fingerprint filter, the log is duplicated over and over (on every refresh the number of duplicates of the log file keeps increasing).
That's why I put a fingerprint filter here.
Please help me resolve this.
Is there a way to generate a unique id field and bind it to the fingerprint for this?
In that case I honestly do not understand what you are asking for. You insert a fingerprint filter to eliminate duplicates, then ask how to make sure that duplicates are retained.
What do you mean by "every refresh" and what does the filebeat configuration look like? In fact this may be a filebeat question rather than a logstash question. It may be that the log creation/rotation of whatever is writing logs is not compatible with your filebeat configuration.
The file can contain duplicate (identical) lines. When I check in Kibana, they should appear as 3 records (3 hits).
But when I send this to Kibana without using fingerprint, the hits keep increasing over and over (every refresh increases the hit count), duplicating the same log again and again. (I don't know if it is a filebeat or logstash issue.)
When I used fingerprint, only 2 hits appear (the duplicated line does not appear).
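If the goal is to keep genuine duplicate lines while preventing the same file from being re-ingested, one commonly suggested pattern is to fingerprint the message together with where it sits in the file. The sketch below assumes filebeat is adding its usual [log][file][path] and [log][offset] fields (the ECS defaults in recent versions; adjust the names if yours differ). Identical lines at different offsets then get different document_ids, while a re-read of the same line maps back to the same document instead of inflating the hit count:

```
filter {
  fingerprint {
    # Concatenate the line content with its file path and byte offset
    # before hashing, so two identical lines at different positions get
    # different fingerprints, while a re-read of the same line produces
    # the same fingerprint and just overwrites the existing document.
    source              => ["message", "[log][file][path]", "[log][offset]"]
    concatenate_sources => true
    method              => "SHA256"
    target              => "[@metadata][fingerprint]"
  }
}
output {
  elasticsearch {
    hosts       => ["localhost:9200"]   # assumed host
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```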