Logstash Sincedb duplicate entries


I'm using the ELK stack to analyze a log file.
Right now I copy the log file from a remote machine to my local machine over SSH, using a bash script that runs every hour.
The log file gets new data appended every minute.
This is my conf:

input {
  file {
    path => "Log.log"
    start_position => "end"
    sincedb_path => "/etc/logstash/sincedb"
  }
}

filter {
  grok {
    match => { "message" => "..." } # pattern redacted
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss" ]
    target => "@timestamp"
  }
  mutate {
    remove_field => ["message"] # Remove the 'message' and 'timestamp' fields if needed
  }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "index1"
  }
}
The issue is that even though I'm using sincedb, every hour when the log file gets updated, Logstash starts reading it from the beginning again, which leads to duplicates in my Elasticsearch index.

Had a look at the sincedb file:

2063229 0 66305 34537335 1698822168.460246 Log1.log

2063229 0 66305 848366127 1698830975.2680178 Log1.log

As you can see, it looks like the numbers in sincedb just keep adding up (?)
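For reference, this is my reading of the sincedb columns in recent Logstash versions (treat the column names as an assumption):

    2063229  0      66305  848366127  1698830975.2680178  Log1.log
    inode    major  minor  offset     last activity       path

So the fourth column is the byte offset Logstash has read up to, not an accumulated size.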

Does anyone have an idea what the issue could be?

You are creating a new file each time you clone the log. The sincedb values in the two entries you posted show a different minor device number. I guess you are working on Windows, so those are actually nFileIndexHigh/nFileIndexLow rather than major/minor device numbers.
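The file-identity behavior is easy to see locally. A minimal sketch (hypothetical paths, GNU coreutils `stat`) of why a copy-then-rename makes sincedb treat the file as brand new:

```shell
#!/bin/sh
# Local simulation: "write a temp file, then rename it over the original"
# (what scp/rsync do by default) gives the path a brand-new inode.
# Logstash's sincedb keys on the inode, so it re-reads the file from the start.
set -e
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir"

echo "line 1" > Log.log
before=$(stat -c '%i' Log.log)   # inode of the original file

# Simulate the hourly clone: new temp file, then rename over the original.
cp Log.log Log.log.tmp
echo "line 2" >> Log.log.tmp
mv Log.log.tmp Log.log

after=$(stat -c '%i' Log.log)    # inode of the "updated" file
echo "inode before: $before  after: $after"

# In contrast, appending in place keeps the inode (and the sincedb entry) valid.
echo "line 3" >> Log.log
test "$(stat -c '%i' Log.log)" = "$after" && echo "append kept the inode"
```

With rsync you can use `--inplace` (or `--append` for append-only logs) so the existing file is updated rather than replaced, which keeps the inode stable and lets sincedb resume where it left off.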

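A complementary mitigation (not from this thread, just a standard Logstash pattern): even if a re-read happens, you can make indexing idempotent by deriving the Elasticsearch document id from the event content with the fingerprint filter, so re-ingested lines overwrite themselves instead of duplicating. A sketch against the config above:

    filter {
      fingerprint {
        source => "message"
        target => "[@metadata][fingerprint]"
        method => "SHA256"
      }
    }

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "index1"
        document_id => "%{[@metadata][fingerprint]}"
      }
    }

Note that this trades duplicate documents for extra indexing work, and identical log lines would collapse into one document unless you include something like the timestamp in the fingerprint source.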
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.