Hi Team,
I'm seeing a large number of duplicates for each individual doc sent by Logstash. I am still unsure whether this is specific to the 4.1.x plugin. I will run a test with 4.0.3 and confirm this later.
Below is my input config:
file {
  sincedb_path => "/foo/bar/.sincedb-foobar"
  max_open_files => 10000
  close_older => 60
  ignore_older => 1296000 # 15 days
  path => "/foo/bar/*.log"
  type => "grid_node"
  start_position => "beginning"
  codec => multiline {
    patterns_dir => ["${FOOBAR_PATTERNS}"]
    pattern => "%{FOOBAR}"
    negate => "true"
    what => "previous"
    auto_flush_interval => 150
    max_lines => 5000
  }
}
I suspect the duplication is something to do with the log file being rolled because for other types of logs that do not get rolled I don't see this issue (or maybe not yet):
As shown in the config, Logstash watches *.log files in the /foo/bar folder. These *.log files get rolled over after some time, ending with .log.1, then .log.2, then .log.x and so on, which Logstash is not configured to watch.
The duplicate docs all have the path pointing to the original log file ending with .log, but when I grep for the particular log line I find it in the rolled-over file ending with .log.x.
What I don't get is why there are so many duplicates. There are probably hundreds, and the count keeps growing because Logstash keeps sending these dupes non-stop, whilst the rolled-over files only go up to a maximum of .log.3.
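To illustrate what I think happens at rollover, here is a minimal Python sketch. It assumes the logs are rotated by rename, which I have not confirmed for this setup, and the file name foo.log is made up for the example. The file input records files in the sincedb by inode, so the point is only to show that under this assumption the rolled-over file keeps the old inode while the new .log file gets a different one.

```python
import os
import tempfile


def simulate_rotation(directory):
    """Rename foo.log to foo.log.1, recreate foo.log, and return the inodes seen."""
    live = os.path.join(directory, "foo.log")  # hypothetical file name
    rolled = live + ".1"

    # Write the original log file and remember its inode.
    with open(live, "w") as f:
        f.write("line written before rotation\n")
    inode_before = os.stat(live).st_ino

    # Assumption: rotation is a rename followed by creating a fresh file.
    os.rename(live, rolled)
    with open(live, "w") as f:
        f.write("line written after rotation\n")

    return {
        "before": inode_before,
        "rolled": os.stat(rolled).st_ino,
        "new_live": os.stat(live).st_ino,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inodes = simulate_rotation(tmp)
        print(inodes)
        print("rolled file keeps old inode:", inodes["rolled"] == inodes["before"])
        print("new .log has a new inode:", inodes["new_live"] != inodes["before"])
```

If the rotation here is copy-and-truncate instead, the inodes would behave differently, so this is only a sketch of one possible rollover scheme and not a diagnosis.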
Cheers,