Logstash output: shrinkWindow is unexpectedly called twice?


(Spacewander) #1

Hey guys:

I have just read the code in:

, and find something strange here:

Once c.publishWindowed returns an err, the send window will be shrinked by this line:

However, there will be an asynchronous call in:


, which also shrinks the window.

Hence, if sending data to logstash failed, the window will be shrinked twice, one is synchronous and the other is asynchronous.

Are there any mistakes in my derivation? Is the behaviour expected or just a bug?


(Steffen Siering) #2

Have you got some logs confirming you observation? run beats in foreground with -e -d 'logstash' and kill running logstash.

The window is supposed to shrink with every batch(window) that has failed. Originally, a batch not being send by publishWindowed, will not be handled by async handler. That is (*msgReg).fail() was not be executed. But this changed at some point. I think you're right, shrinkWindow is executed twice.

Good news is, with beats input being rewritten recently, I'm planning to remove the windowing support. The windowing was used to figure an ok-batch size so to not having to reconnect every so often or continuously fail with logstash being overloaded. The rewrite sports a keep-alive signal, making the dynamic windowing in output plugin superfluous.


(Spacewander) #3

Thank you for your reply.
The way to reproduce the problem:

  1. Build the filebeat from master branch.

  2. Modify filebeat.yml with

    output.logstash:
      # The Logstash hosts
      hosts: ["localhost:5044"]
      loadbalance: true
      pipelining: 1
      bulk_max_size: 2
    
  3. Then run logstash, and also filebeat

  4. Kill the logstash

  5. Get log like this:

2016/09/01 02:13:47.070020 async.go:138: DBG  Try to publish 2 events to logstash with window size 10
2016/09/01 02:13:47.657564 async.go:114: DBG  2 events out of 2 events sent to logstash. Continue sending
2016/09/01 02:13:47.657596 async.go:138: DBG  Try to publish 2 events to logstash with window size 2
2016/09/01 02:13:47.660661 async.go:114: DBG  0 events out of 2 events sent to logstash. Continue sending
2016/09/01 02:13:47.660685 async.go:68: DBG  close connection

It seems that window size shrinks twice, from 10 to 5, and then from 5 to 2, between two publishing.


(Steffen Siering) #4

Nice. You're totally right!!! Great you found this!

Also the eventsNotAcked counter might be wrong, as batch is reported to be failed twice. Once during error handling and once by callback. Shrinking window + updating counter should only happen in (*msgRef).dec(). Also (*msgRef).fail() is not reporting the failed send. Maybe the log message should be moved too.

You want to create a PR fixing this bug (I see you signed CA already)? Given you did dig through async IO-code finding it, I'd rather see your name next to the fix then mine. Honor to whom honor is due :wink:


(system) #5

This topic was automatically closed after 21 days. New replies are no longer allowed.