Hi, community!
My ELK stack is showing some strange behavior.
Recently I found that every document in ES is indexed multiple times.
The documents are Netflow flows.
The copies have different _id values, although they are the same document (describing the same connection).
I've verified that the data is received correctly on the UDP socket, but the bulk requests to ES already contain the same document multiple times (in different requests). So the problem is within LS.
After some data inspection and configuration tuning I observed that each document is multiplied 5 times. The duplication appeared after I added additional config files under /etc/logstash/conf.d/, and the number of copies is exactly the number of configuration files I have there. Just for test purposes I removed 2 of them, and yes, the number of document copies dropped to 3.
How can this be? Has anyone faced the same symptoms?
I can provide any necessary information from the Logstash socket; please ask.
OS: CentOS 7; Linux 3.10.0-514.2.2.el7.x86_64
LS: "version" : "5.1.2"
One of the configuration files:
input {
  udp {
    port => 2056
    codec => netflow { }
  }
}
output {
  elasticsearch {
    pool_max => 1000
    pool_max_per_route => 400
    manage_template => false
    flush_size => 10000
    hosts => localhost
    index => "netflow-%{+YYYY.MM.dd.HH}"
  }
}
In logstash.yml I changed only: pipeline.batch.size: 20000
The number of pipeline workers equals the number of CPU cores (24); output workers = 1
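One possible explanation, which I have not confirmed on this setup: Logstash 5.x loads all files under /etc/logstash/conf.d/ as a single concatenated pipeline, so every event from every input would pass through every output block. With 5 files that each contain an elasticsearch output, that would give 5 copies per flow. Below is a minimal sketch of how one file could be isolated under that assumption, by labelling the input and wrapping the output in a conditional. The type value "netflow_2056" is a hypothetical label chosen for illustration, and the tuning options from the original file are left out for brevity.

```
input {
  udp {
    port => 2056
    codec => netflow { }
    # Hypothetical label, used only to tell this input's events apart
    type => "netflow_2056"
  }
}
output {
  # Only events carrying this file's label reach this output
  if [type] == "netflow_2056" {
    elasticsearch {
      manage_template => false
      hosts => localhost
      index => "netflow-%{+YYYY.MM.dd.HH}"
    }
  }
}
```

If this is the cause, each of the other files in conf.d would need its own label and the same kind of conditional around its outputs.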
Any ideas?
Thanks,
Dmitry