Logstash shuts down while events are still being processed

I've encountered a weird case while trying to load events from a DB and forward them to ActiveMQ via the Stomp protocol.

Basically, my config is very simple and looks like the one below. The SELECT returns approximately 25K IDs, which are then forwarded as messages via the stomp output.

input {
    jdbc {
        jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        jdbc_connection_string => "jdbc:mysql://host:3306/db"
        jdbc_user => "user"
        jdbc_password => "pass"
        jdbc_page_size => 1000
        jdbc_paging_enabled => true
        statement => "SELECT id FROM table WHERE date BETWEEN '2016-01-01' AND '2016-06-01'"
    }
}
filter {
}
output {
    stomp {
        host => "localhost"
        port => 61637
        destination => "my_queue"
    }
}

If the number of returned records is low (<500), all messages get sent. But as I increase the date interval and return more records (~25K), not all make it to the queue.

At first, I thought that ActiveMQ was expiring my messages before they could be consumed. So, to rule out any issue on the ActiveMQ side, I wrote another Stomp client in Node.js that also sent 25K messages, and all of them made it to the queue.

The next candidate was the stomp output implementation, which I suspected was faulty somehow. But before diving into that plugin, I decided to prevent Logstash from stopping, simply by adding a stdin{} input, and see what would happen if Logstash kept running.
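Concretely, the only change was an extra input to keep the pipeline alive, roughly like this (the jdbc settings are the same as above):

input {
    jdbc {
        # ... same jdbc settings as above ...
    }
    # dummy input whose only purpose is to keep Logstash running
    stdin {}
}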

All of a sudden, miracle, all 25K messages made it to the queue. By observing the activity on the queue, I could see that it took a few seconds for Logstash to send all in-flight events, but they eventually all made it, simply by virtue of keeping Logstash up.

In the debug log, we can see that as soon as the last event has been handled by the stomp output, Logstash begins the shutdown sequence (i.e. in the same millisecond), which doesn't leave the messages much chance to actually make it to the queue. An excerpt from my Logstash log can be seen here.

So my question is: why does Logstash shut down while there are still in-flight events being filtered/output?

Another possibility is that the stomp output tells Logstash that it has finished its job as soon as @client.send() is called. Since there's no ACK, I'm wondering whether the problem could be solved in the stomp output plugin simply by making sure to get an ACK from the broker before returning from the receive function. The related issue can be followed here.
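For context, the plugin's receive path boils down to a fire-and-forget send. This is only a simplified sketch of what I understand the plugin does, not its actual source:

# simplified sketch of the stomp output's receive path
def receive(event)
  # hands the frame to the STOMP client and returns immediately;
  # nothing waits for the broker to confirm delivery
  @client.send(event.sprintf(@destination), event.to_json)
end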

I'm using Logstash 2.3.2, and I could reproduce the same issue on 5.0.0-alpha3 as well.

Thoughts?

For those interested, the solution was simply to override the close method in the stomp output plugin and call disconnect on the stomp connection, in order to give all in-flight messages a chance to be sent before letting Logstash proceed with the shutdown.
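A minimal sketch of that fix, assuming the underlying STOMP client is held in @client (as in the plugin):

# on pipeline shutdown, disconnect cleanly so queued frames get flushed
def close
  @client.disconnect
end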

The fix has been merged and new 2.0.5 and 3.0.1 releases of the logstash-output-stomp plugin are now available.

In addition, the plugin now also supports:

  • sending events in batches (instead of one by one) and
  • adding headers to outbound messages.
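For example, a config using the new headers feature might look like the following (the option name and header values here are just an illustration; check the plugin documentation for the exact syntax):

output {
    stomp {
        host => "localhost"
        port => 61637
        destination => "my_queue"
        # illustrative only: ask the broker to persist messages
        headers => {
            "persistent" => "true"
        }
    }
}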