Hi,
I've been wrestling with my Aggregate plugin configuration and it seems like it interacts poorly with the Logstash micro-batching framework in my case.
I have log lines similar to this (simplified):
2018-03-21 10:05:24,305 Start task Task1
2018-03-21 10:05:24,306 Finish task
2018-03-21 10:05:24,307 Start task Task2
2018-03-21 10:05:24,308 Finish task
My logstash config file (filter section) looks roughly like this:
grok {
  match => { "message" => "^%{TIMESTAMP_ISO8601:local_timestamp}\s+%{GREEDYDATA:message}" }
  overwrite => [ "message" ]
}
grok {
  match => { "message" => "^Start task %{GREEDYDATA:task_id}" }
}
if "_grokparsefailure" not in [tags] {
  aggregate {
    task_id => "%{source}" # Note the use of a global task_id
    code => "map[:task_id] = event.get('task_id')"
  }
}
if [message] == "Finish task" {
  aggregate {
    task_id => "%{source}"
    code => "event.set('task_id', map[:task_id])"
  }
}
Now my problem is that the filters seem to be applied per batch rather than per event:
- The first grok filter (extract message) is applied to a batch of 125 events
- The second grok filter (find 'Start task') is applied to the whole batch
- Then the first aggregate filter is applied to the same batch
- Finally, the second aggregate filter is applied, again to the whole batch
This per-filter batching makes events appear to be processed out of order, similar to what happens with multiple pipeline workers (I have only one). In the example above, both 'Finish' events receive the task_id 'Task2'.
I have found two different solutions/workarounds to my issue:
- Set pipeline.batch.size to 1 in logstash.yml to force events to get treated one at a time
- Wrap the entire filter section in a giant 'if [message] =~ /^/', which seems to force the logstash internals to treat the entire block as one serialized filter
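For reference, the first workaround amounts to these settings in logstash.yml (worker count shown as well, since the aggregate plugin already requires it):

```yaml
# logstash.yml
pipeline.workers: 1     # already required by the aggregate plugin
pipeline.batch.size: 1  # forces events through the filter chain one at a time
```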
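And the second workaround is just wrapping the whole filter section in an always-true conditional (/^/ matches any message); the individual filter bodies are elided here:

```
filter {
  if [message] =~ /^/ {  # matches every event
    grok { ... }
    grok { ... }
    aggregate { ... }
    aggregate { ... }
  }
}
```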
This gives rise to my questions:
- Am I missing something obvious?
- Are there any other ways around the issue?
- What is the likely performance impact of changing the batch size?
- Is the workaround of a global 'if' statement likely to stop working at some point?
- Should the aggregate plugin documentation mention this limitation, similar to the way that there is a prominent warning about only having 1 worker?
Thanks