Filter:drop performance varies by 200x

TL;DR: for no apparent reason, logstash takes anywhere from 0.24ms to 56ms to drop an event (a ~200x difference!).

I've been trying to track down a performance issue with our logstash setup. We're getting huge backlogs, which may have to do with the amount of data we're feeding into it. We filter out a lot of events but still see poor performance, so last night I started looking at the pipeline stats and saw the following for a drop filter:

To play around with the setup and test it out locally on my development box, I duplicated a similar pipeline, injected randomly generated events (some of which get filtered), and got the following:


Not great performance but similar to what I'm seeing in prod.
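For reference, a local reproduction along these lines can be driven with the generator input plugin. A minimal sketch, where the field values are made up to match the filter patterns below:

```
input {
  generator {
    lines => [
      '{"type":"rails","status":200,"path":"/magic/string/1"}',
      '{"type":"rails","status":200,"path":"/other/thing"}'
    ]
    codec => "json"
    count => 0  # 0 = keep generating events indefinitely
  }
}
```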

After a slight change to the filter (chaining the conditions with else if instead of two ifs one after the other), I suddenly got a drastic performance increase:
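Reconstructed from the description above, the restructured version chained the path checks rather than testing them independently; roughly, the only change is the else:

```
if [type] == "rails" {
  if [status] == 200 {
    if [path] =~ /^(\/[^\/]+)?\/magic\/string/ {
      drop { }
    } else if [path] =~ /^(\/[^\/]+)?\/secret\/sauce/ {
      drop { }
    }
  }
}
```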

Then, to verify that this change was the cause, I reverted to the original structure, but the performance increase stayed, so the verification failed:

At this point I left it for the night.

This morning I decided to re-run, without restarting anything, the exact same test I did last night. And got a 200x performance difference, with the pipeline viewer reporting it was now taking 50ms/event to DROP messages:


This doesn't make any sense to me and makes me think something is going wrong inside logstash. Help!
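For what it's worth, the ms/event figures above can be derived from the monitoring API (GET localhost:9600/_node/stats/pipelines) by dividing each filter plugin's duration_in_millis by its event count. A sketch in Ruby against a hand-written stats excerpt (the field names follow the API response shape; the id and numbers are invented):

```ruby
require 'json'

# Hypothetical excerpt of a pipeline stats response; only the fields
# needed for the per-event calculation are included, and the numbers
# are made up for illustration.
sample = <<~JSON
  {
    "plugins": {
      "filters": [
        { "id": "drop_magic", "name": "drop",
          "events": { "in": 100000, "duration_in_millis": 24000 } }
      ]
    }
  }
JSON

stats = JSON.parse(sample)
stats["plugins"]["filters"].each do |f|
  # ms spent in the filter divided by events that entered it
  per_event = f["events"]["duration_in_millis"].to_f / f["events"]["in"]
  puts "#{f['id']} (#{f['name']}): #{per_event} ms/event"
end
```

With these sample numbers that works out to 0.24 ms/event, i.e. the fast end of the range seen above.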


  • logstash 6.0.1
  • es 6.0.0
  • kibana 6.0.0
  • ubuntu 16.04.3
  • docker 17.12.0-ce

logstash filter:

filter {
  mutate {
    convert => [
      "status", "integer",
      "intensity", "integer"
    ]
  }
  # test out the patterns we're using in prod
  if [type] == "rails" {
    if [status] == 200 {
      if [path] =~ /^(\/[^\/]+)?\/magic\/string/ {
        drop { }
      }
      if [path] =~ /^(\/[^\/]+)?\/secret\/sauce/ {
        drop { }
      }
    }
  }
}