How can I send duplicate lines in a log file from Logstash to Elasticsearch?

I'm a beginner in this area and I want to send duplicate lines of a log file to Elasticsearch. Currently it identifies duplicates and sends only one line.

Sample log file:

00:00:59,62.10.38.47,39473311,"POST /v0.1/test9/889 HTTP/1.1",202
00:00:59,62.10.38.47,39473311,"POST /v0.1/test9/889 HTTP/1.1",202
00:00:59,62.10.38.47,39473311,"POST /v0.1/test9/889 HTTP/1.1",201

# What I expect: all 3 of these lines need to be sent to Elasticsearch (currently only 2 lines are sent).

# Logstash configuration file

input {
  beats {
    port => 5044
  }
}

filter {
  mutate {
    split => { "message" => "," }
    add_field => {
      "msg_actual_time" => "%{+YYYY-MM-dd} %{[message][0]}"
    }
  }

  date {
    match => ["msg_actual_time", "YYYY-MM-dd HH:mm:ss"]
    timezone => "Asia/Colombo"
    locale => "en"
    target => "@timestamp"
  }

  fingerprint {
    method => "SHA1"
    key => "KEY"
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    document_id => "%{fingerprint}"
    user => "elastic"
    password => "changeme"
    index => "doc-log-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}

I have tried to add a line number field in Logstash, but it generates the same number for the duplicate records.

Could you please help me resolve this?
Thank you.

By default, Elasticsearch will generate a unique document_id for each event. You are forcing it to use a document_id based on the value of the message field. If two events contain the same value of [message], they will get the same document_id.

If you want Elasticsearch to keep duplicate messages, remove the fingerprint filter and do not set the document_id option.
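
For example (a minimal sketch, reusing the hosts, credentials and index name from the configuration above), the output section would then be:

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    user => "elastic"
    password => "changeme"
    index => "doc-log-%{+YYYY.MM.dd}"
    # no document_id set, so Elasticsearch assigns a unique _id to every event
  }
  stdout { codec => rubydebug }
}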

Thank you for the reply. When I remove the fingerprint filter, the log is duplicated over and over (with every refresh the duplicates of the log file keep increasing).
That's why I put a fingerprint filter here.

Please help me resolve this.
Is there a way to generate a unique ID field and bind it to the fingerprint for this?

Thank you.

In that case I honestly do not understand what you are asking for. You insert a fingerprint filter to eliminate duplicates, then ask how to make sure that duplicates are retained.

What do you mean by "every refresh" and what does the filebeat configuration look like? In fact this may be a filebeat question rather than a logstash question. It may be that the log creation/rotation of whatever is writing logs is not compatible with your filebeat configuration.
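
For reference, here is a minimal Filebeat log input sketch (illustrative only, not from this thread; the path is a placeholder, and close_renamed / clean_removed are options that can matter when log files are rotated or rewritten):

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /path/to/logs/*   # placeholder path
  # illustrative rotation-related options; whether they help depends on how the logs are written
  close_renamed: true
  clean_removed: true

output.logstash:
  hosts: ["localhost:5044"]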

Actually, what I need to do is this: I have log files like the one below,

00:00:59,62.10.38.47,39473311,"POST /v0.1/test9/889 HTTP/1.1",202
00:00:59,62.10.38.47,39473311,"POST /v0.1/test9/889 HTTP/1.1",202
00:00:59,62.10.38.47,39473311,"POST /v0.1/test9/889 HTTP/1.1",201

A file can contain duplicate (identical) lines. When I check in Kibana, these should appear as 3 records (3 hits).

But when I send this to Kibana without using fingerprint, the hits keep increasing (every refresh the hit count grows), duplicating the same log again and again. (I don't know if it is a Filebeat or Logstash issue.)

When I use fingerprint, only 2 hits appear (the duplicated line does not appear).
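
One possible approach (a sketch, not something suggested in the thread) is to fingerprint the message together with the file path and byte offset that Filebeat's log input adds to each event ([log][file][path] and [log][offset]). Identical lines at different offsets then get different document IDs, while re-shipping the same event still produces the same ID:

filter {
  fingerprint {
    # hash the line together with where it came from, so identical lines stay distinct
    source => ["message", "[log][file][path]", "[log][offset]"]
    concatenate_sources => true
    method => "SHA1"
    key => "KEY"
    target => "[@metadata][fingerprint]"
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    document_id => "%{[@metadata][fingerprint]}"
    user => "elastic"
    password => "changeme"
    index => "doc-log-%{+YYYY.MM.dd}"
  }
}

In the configuration posted earlier the mutate filter splits message into an array first, so it may be cleaner to run the fingerprint filter before that split (or hash a copy of the original line).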

filebeat.yml

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /home/test/Documents/logs/*
  tags: ["docs1"]
  fields:
  fields_under_root: true

output.logstash:
  hosts: ["localhost:5044"]

Please check this and help me resolve it.
Thank you.

This is fixed. :slight_smile: The issue was the version (it seems to be a bug in 7.5.2). I moved to 7.6.1 (for Filebeat, Logstash, and Elasticsearch), and the issue is resolved.

Thanks!
