Logstash Sincedb duplicate entries


I'm using the ELK stack to analyze a log file.
Right now I copy the log file from a remote machine to my local machine over SSH, using a bash script that runs every hour.
The log file gets new data appended every minute.
This is my conf:

input {
  file {
    path => "Log.log"
    start_position => "end"
    sincedb_path => "/etc/logstash/sincedb"
  }
}

filter {
  grok {
    match => { "message" => "..." } # pattern redacted
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss" ]
    target => "@timestamp"
  }
  mutate {
    remove_field => ["message"] # Remove the 'message' and 'timestamp' fields if needed
  }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "index1"
  }
}
The issue is that even though I'm using sincedb, every hour when the log file gets updated, Logstash starts reading it from the beginning again, which leads to duplicates in my Elasticsearch index.

Had a look at the sincedb file:

2063229 0 66305 34537335 1698822168.460246 Log1.log

2063229 0 66305 848366127 1698830975.2680178 Log1.log

As you can see, it looks like the numbers in sincedb just keep adding up (?)
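For reference, this is my reading of the sincedb columns in recent Logstash versions (treat the column names as an assumption):

    2063229  0      66305  848366127  1698830975.2680178  Log1.log
    inode    major  minor  offset     last activity       path

So the fourth column is the byte offset Logstash has read up to, not an accumulated size.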

Does anyone have an idea what the issue could be?

You are creating a new file each time you clone the log. The sincedb values in the two entries you posted show a different minor device number. I guess you are working on Windows, so those are actually nFileIndexHigh/nFileIndexLow rather than major/minor device numbers.
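The file-identity behavior is easy to see locally. A minimal sketch (hypothetical paths, GNU coreutils `stat`) of why a copy-then-rename makes sincedb treat the file as brand new:

```shell
#!/bin/sh
# Local simulation: "write a temp file, then rename it over the original"
# (what scp/rsync do by default) gives the path a brand-new inode.
# Logstash's sincedb keys on the inode, so it re-reads the file from the start.
set -e
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir"

echo "line 1" > Log.log
before=$(stat -c '%i' Log.log)   # inode of the original file

# Simulate the hourly clone: new temp file, then rename over the original.
cp Log.log Log.log.tmp
echo "line 2" >> Log.log.tmp
mv Log.log.tmp Log.log

after=$(stat -c '%i' Log.log)    # inode of the "updated" file
echo "inode before: $before  after: $after"

# In contrast, appending in place keeps the inode (and the sincedb entry) valid.
echo "line 3" >> Log.log
test "$(stat -c '%i' Log.log)" = "$after" && echo "append kept the inode"
```

With rsync you can use `--inplace` (or `--append` for append-only logs) so the existing file is updated rather than replaced, which keeps the inode stable and lets sincedb resume where it left off.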

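A complementary mitigation (not from this thread, just a standard Logstash pattern): even if a re-read happens, you can make indexing idempotent by deriving the Elasticsearch document id from the event content with the fingerprint filter, so re-ingested lines overwrite themselves instead of duplicating. A sketch against the config above:

    filter {
      fingerprint {
        source => "message"
        target => "[@metadata][fingerprint]"
        method => "SHA256"
      }
    }

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "index1"
        document_id => "%{[@metadata][fingerprint]}"
      }
    }

Note that this trades duplicate documents for extra indexing work, and identical log lines would collapse into one document unless you include something like the timestamp in the fingerprint source.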
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.