Hey,
I'm using the ELK stack to analyze a log file.
Right now I copy the log file from a remote server onto my local machine via SSH, using a bash script that runs every hour.
The log file gets new data appended every minute.
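The copy step is roughly this (simplified; remote host, user and paths are placeholders, not my real ones):

#!/usr/bin/env bash
# pull-log.sh - runs from cron once an hour and overwrites the local copy
scp appuser@logserver:/var/log/app/Log.log /path/to/Log.log

and the matching crontab entry:

0 * * * * /path/to/pull-log.sh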
This is my conf:
input {
  file {
    path => "Log.log"
    start_position => "end"
    sincedb_path => "etc/logstash/sincedb"
  }
}
filter {
  grok {
    match => {"Filtering"}
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss" ]
    target => "@timestamp"
  }
  mutate {
    remove_field => ["message"] # Remove the 'message' and 'timestamp' fields if needed
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "index1"
  }
}
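For what it's worth, while testing I run the pipeline in the foreground with something like this (the config path is just an example):

bin/logstash -f /etc/logstash/conf.d/pipeline.conf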
The issue: even though I'm using sincedb, every hour when the log file gets updated, Logstash starts reading it from the beginning again, which leads to duplicates in my Elasticsearch index.
I had a look at the sincedb file:
2063229 0 66305 34537335 1698822168.460246 Log1.log
2063229 0 66305 848366127 1698830975.2680178 Log1.log
As you can see, it seems like the offsets in sincedb just keep adding up (if I read the format correctly, the first column is the inode and the fourth is the byte offset already read)?
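I guess after the next sync I can compare that offset with the local file directly, something like this (path is a placeholder):

# inode and size in bytes of the local copy (GNU stat)
stat -c 'inode=%i size=%s' /path/to/Log.log
# byte count; should be in the same ballpark as the sincedb offset
wc -c < /path/to/Log.log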
Does anyone have an idea what the issue could be?