The beats input plugin in Logstash uses a circuit breaker that closes connections when the input plugin cannot push events into the pipeline. The default circuit-breaker timeout is 5 seconds. In addition, beats may drop the connection and resend if Logstash is unresponsive for N seconds (default = 30 seconds, I think).
Related config options:
- `congestion_threshold` in the Logstash beats input
- `timeout` in the filebeat logstash output
- (optional) `bulk_max_size` in filebeat. Reducing the bulk size has little effect on Logstash itself, but ACKs may be returned earlier, reducing the chance of timeouts in filebeat.
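For reference, this is roughly where those options live (port, hosts, and the `bulk_max_size` value shown are illustrative; `congestion_threshold` and `timeout` are shown at the defaults mentioned above):

```
# logstash config
input {
  beats {
    port => 5044
    congestion_threshold => 5   # circuit-breaker timeout in seconds (default)
  }
}
```

```yaml
# filebeat.yml
output:
  logstash:
    hosts: ["logstash:5044"]
    timeout: 30         # seconds filebeat waits before dropping the connection (default)
    bulk_max_size: 2048 # events per batch; illustrative value
```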
I'd recommend setting `congestion_threshold` to X (some very large number of) years so the circuit breaker is effectively disabled, and raising `timeout` in filebeat to a higher acceptable value: at least twice the maximum timeout of your Logstash outputs times the per-event processing overhead, assuming the problem is not slow filters (e.g. 120 seconds). Monitor the filebeat logs (info level) or Logstash for reconnects and adjust the filebeat timeout accordingly.
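A sketch of that tuning, assuming the 120-second example above; the exact `congestion_threshold` value is arbitrary, just pick something far larger than any realistic stall:

```
# logstash config
input {
  beats {
    port => 5044
    # very large value (in seconds) to effectively disable the circuit breaker
    congestion_threshold => 999999999
  }
}
```

```yaml
# filebeat.yml
output:
  logstash:
    hosts: ["logstash:5044"]
    timeout: 120  # raised per the sizing rule above; tune after watching for reconnects
```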
The root cause is most likely an unresponsive or slow output, or a Logstash filter that stalls or is slow (e.g. an inefficient grok filter).