I've done some investigation here, and the EOF is due to how we handle idle writers; in this case, the writer is FB. This feature closes an idle or latent connection after a delay, which defaults to 15s; this corresponds to the delay you see in your log between each event.
But there is a catch: if Logstash takes too much time to process a batch, a client timeout can occur because the input would not read from the socket. It might be related to the batch size you currently have configured.
I'll create a PR to fix that problem and make sure that if a timeout occurs while Logstash is still processing a batch, it won't close the connection. Until the fix is released, you can bump `client_inactivity_timeout` to a bigger value like 900 (15 minutes).
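As a sketch, the workaround looks like this in the beats input configuration (the port here is a placeholder; adjust it to your setup):

```
input {
  beats {
    port => 5044                       # hypothetical port, use your own
    client_inactivity_timeout => 900   # 15 minutes instead of the 15s default
  }
}
```

This only delays the idle disconnect; the underlying timeout-while-processing behavior is what the PR will address.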
No events should be lost in that case since FB will retransmit the complete window, but it can slow down the consumption of events.
A few notes on things I saw in this issue:
- I would not change `pipeline.batch.size`; the default is 125, and we have found that this is a sweet spot for performance. Using a value of 15000 for the batch means you will use a lot more memory than needed.
- `congestion_threshold` is deprecated; when you start Logstash you should see a warning about that.
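For reference, a minimal sketch of the relevant settings (values shown are the defaults discussed above, not a recommendation beyond them):

```
# logstash.yml — keep the default batch size
pipeline.batch.size: 125   # default; 15000 allocates far more memory than needed
```

And drop `congestion_threshold` from your beats input block entirely rather than setting it, since it is deprecated and ignored aside from the startup warning.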