We have 4 Logstash servers that currently take only ~4000/sec. This works fine almost all the time, but occasionally a couple of the servers show UDP drops.
At the same time, I see the following exception logged:
syslog tcp output exception: closing, reconnecting and resending event {:host=>"xxx.xxx.xxx", :port=>514, :exception=>#<SystemCallError: Unknown error (SystemCallError) - No message available>
I am trying to find out whether the syslog TCP output exception is causing packets to pile up and leading to the UDP drops, or whether it is the other way around.
Logstash config:
4 servers - 4 CPU, 8 GB each
workers => 8
queue_size => 50000
pipeline.workers => 8
batch_size => 2000
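
To work out which problem comes first, one option might be to sample the kernel's UDP error counters with timestamps and line them up against the timestamps of the syslog TCP output exception in the Logstash log. Below is a minimal sketch, assuming the servers run Linux and expose /proc/net/snmp (the OS is not stated above). The function names `parse_udp_counters`, `counter_delta` and `watch` are just illustrative helpers, not part of Logstash.

```python
import time

# Counters that typically grow when UDP datagrams are dropped on receive.
DROP_COUNTERS = ("InErrors", "RcvbufErrors")


def parse_udp_counters(snmp_text):
    """Return the Udp counters from /proc/net/snmp content as a dict."""
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("no Udp section found")
    header, values = udp_lines[0], udp_lines[1]
    # First token on both lines is the "Udp:" label, so skip it.
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def counter_delta(prev, cur, names=DROP_COUNTERS):
    """Return how much each drop counter grew between two samples."""
    return {n: cur[n] - prev[n] for n in names if n in cur and n in prev}


def watch(interval=1.0, path="/proc/net/snmp"):
    """Print a timestamped line whenever a drop counter increases."""
    prev = None
    while True:
        with open(path) as f:
            cur = parse_udp_counters(f.read())
        if prev is not None:
            delta = counter_delta(prev, cur)
            if any(delta.values()):
                print(time.strftime("%Y-%m-%dT%H:%M:%S"), delta)
        prev = cur
        time.sleep(interval)


if __name__ == "__main__":
    watch()
```

If the drop counters start climbing before the exception shows up in the log, the output is probably not the trigger. If they climb only after the exception, that would suggest the blocked output is backing up the pipeline until the UDP receive buffer overflows.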