Aggregate messages in sequence order?

I have a vendor appliance sending syslog data to Logstash v6.5.4. The appliance splits messages over 1 kB into separate messages and sends them out via UDP. We had Kafka buffering messages in front of Logstash, but removed it for troubleshooting.

I was able to get message aggregation mostly working using the aggregate filter; however, because of the way messages are split, a field name can be cut off mid-word and continued in the next message (after the header).

In each message I have a sequence ID (a sequential ID number that resets at arbitrary times), a segment total, and a segment number. I've used those three numbers to create a task_id that's unique to that message group.
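One way to derive such a group key (a sketch in plain Ruby, not the poster's actual filter; the field names `seq_id`, `seg_total`, and `seg_no` are assumptions for illustration) is to combine the values that are shared by every segment of a group:

```ruby
# Sketch of deriving a group key from the per-segment fields.
# Field names are assumptions, not taken from the original config.
def task_id(seq_id, seg_total)
  # All segments of one logical message share seq_id and seg_total,
  # so the pair identifies the group; seg_no then orders the
  # segments within that group.
  "#{seq_id}-#{seg_total}"
end
```

In a Logstash pipeline this kind of logic would typically live inside a ruby filter that writes the result into an event field used as the aggregate filter's task_id.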

The problem is that messages can somehow arrive out of order, which breaks the recombination of field names. For example, message 5/6 could end with "permiss" and message 6/6 would begin (after the header) with "ions=....". Allowed to continue, this produces a mess of mangled field names, because the break can fall anywhere in a field name depending on the data.
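To illustrate the kind of order-aware reassembly needed (a plain-Ruby sketch of the idea, not the aggregate filter itself), segments can be buffered in a map keyed by segment number and joined in numeric order once all the pieces of a group have arrived, so arrival order no longer matters:

```ruby
# Buffer segments in a hash keyed by segment number, then join them
# in numeric order once every piece of the group has arrived.
# Returns nil while segments are still missing.
def reassemble(segments, seg_total)
  return nil unless segments.size == seg_total  # still waiting for pieces
  segments.keys.sort.map { |k| segments[k] }.join
end

# Out-of-order arrival: segment 2 shows up before segment 1.
parts = { 2 => "ions=read", 1 => "permiss" }
reassemble(parts, 2)  # => "permissions=read"
```

The aggregate filter's map can play the role of the hash here; the key point is that joining happens by segment number, not by arrival order.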

How can I use the aggregate filter (or anything else) to reassemble those messages in the order they should appear, based on the sequence number?

Logstash generally does not preserve the order of messages (it is multithreaded, and different subsets of messages are processed by different threads). You can set pipeline.workers to 1 so that only one CPU is used, and, for now, you will also need to disable the java_execution engine (although a fix for that has been committed).
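In logstash.yml that would look something like the fragment below. This is a sketch: the exact way to toggle the Java engine varies by version (in some 6.x releases it is a command-line flag rather than a logstash.yml setting), so verify the setting names against your version's documentation.

```yaml
# logstash.yml -- sketch; verify setting names for your Logstash version
pipeline.workers: 1             # single worker thread, so events are filtered in order
pipeline.java_execution: false  # fall back to the Ruby execution engine
```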

Thanks Badger. We already set pipeline.workers=1 and were still having issues, but I don't think we tried disabling the java_execution engine. What does that do, and will it have any effect on some Ruby code I'm using to calculate the task_id?

The java_execution engine was introduced a couple of years back to replace the Ruby execution engine. There is a bug that causes it to re-order events.

Just wanted to follow up: we solved this by setting the listening node to 1 worker, in addition to setting pipeline.workers to 1.
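For reference, the single listener worker can be pinned on the UDP input itself via its workers option (a sketch; the port number is an assumption, adjust it to your setup):

```
input {
  udp {
    port    => 5514  # assumed port, not from the original post
    workers => 1     # single receive thread, so datagrams are read in arrival order
  }
}
```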
