Aggregate messages in sequence order?

I have a vendor appliance sending syslog data to Logstash v6.5.4. The appliance splits messages over 1 kB into separate messages and sends them out via UDP. We had Kafka buffering messages in front of Logstash, but removed it for troubleshooting.

I was able to get message aggregation mostly working using the aggregate filter; however, because of the way messages are split, a field name can be cut off mid-word and continued in the next message (after the header).

In each message I have a sequence ID (a sequential ID number that resets at arbitrary times), a segment total, and a segment number. I've used those three numbers to create a task_id that's unique to that message group.
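One way to derive such a group key (a sketch in plain Ruby, not the poster's actual filter; the field names `seq_id`, `seg_total`, and `seg_no` are assumptions for illustration) is to combine the values that are shared by every segment of a group:

```ruby
# Sketch of deriving a group key from the per-segment fields.
# Field names are assumptions, not taken from the original config.
def task_id(seq_id, seg_total)
  # All segments of one logical message share seq_id and seg_total,
  # so the pair identifies the group; seg_no then orders the
  # segments within that group.
  "#{seq_id}-#{seg_total}"
end
```

In a Logstash pipeline this kind of logic would typically live inside a ruby filter that writes the result into an event field used as the aggregate filter's task_id.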

The problem is that messages can somehow arrive out of order, which breaks the recombination of field names. For example, message 5/6 could end with "permiss" and message 6/6 would begin (after the header) with "ions=....". Allowed to continue, this produces a mess of mangled field names, because the break can fall anywhere in a field name depending on the data.
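To illustrate the kind of order-aware reassembly needed (a plain-Ruby sketch of the idea, not the aggregate filter itself), segments can be buffered in a map keyed by segment number and joined in numeric order once all the pieces of a group have arrived, so arrival order no longer matters:

```ruby
# Buffer segments in a hash keyed by segment number, then join them
# in numeric order once every piece of the group has arrived.
# Returns nil while segments are still missing.
def reassemble(segments, seg_total)
  return nil unless segments.size == seg_total  # still waiting for pieces
  segments.keys.sort.map { |k| segments[k] }.join
end

# Out-of-order arrival: segment 2 shows up before segment 1.
parts = { 2 => "ions=read", 1 => "permiss" }
reassemble(parts, 2)  # => "permissions=read"
```

The aggregate filter's map can play the role of the hash here; the key point is that joining happens by segment number, not by arrival order.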

How can I use the aggregate filter (or anything else) to reassemble those messages in the order they should appear, based on the sequence number?

Logstash generally does not preserve the order of messages (it is multithreaded, and different subsets of messages are processed by different threads). You can set pipeline.workers to 1 so that only one CPU is used, and, for now, you will also need to disable the java_execution engine (although a fix for that has been committed).
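In logstash.yml that would look something like the fragment below. This is a sketch: the exact way to toggle the Java engine varies by version (in some 6.x releases it is a command-line flag rather than a logstash.yml setting), so verify the setting names against your version's documentation.

```yaml
# logstash.yml -- sketch; verify setting names for your Logstash version
pipeline.workers: 1             # single worker thread, so events are filtered in order
pipeline.java_execution: false  # fall back to the Ruby execution engine
```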

Thanks Badger. We already set pipeline.workers=1 and were still having issues, but I don't think we tried disabling the java_execution engine. What does that do, and will it have any effect on some Ruby code I'm using to calculate the task_id?

The java_execution engine was introduced a couple of years back to replace the Ruby execution engine. There is a bug that causes it to re-order events.

Just wanted to follow up: we solved this by setting the listening node to 1 worker, in addition to setting pipeline.workers to 1.
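For reference, the single listener worker can be pinned on the UDP input itself via its workers option (a sketch; the port number is an assumption, adjust it to your setup):

```
input {
  udp {
    port    => 5514  # assumed port, not from the original post
    workers => 1     # single receive thread, so datagrams are read in arrival order
  }
}
```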
