I'm running into strange performance issues with the input-beats plug-in.
Logstash: 2.1.1 (reproduced on trunk yesterday, FWIW)
input-beats: 2.0.3
filebeat: 1.0.1
It seems like throughput is being bottlenecked by something other than CPU or I/O.
I'm doing my tests with these Apache access logs from NASA.
My configuration is very simple; for the tests I just swap the input between tcp, file, and beats (the alternate input blocks are sketched below, after the main config):
input {
  beats {
    port => 5044
  }
}
filter {
  metrics {
    meter => "events"
    add_tag => "metric"
  }
}
output {
  if "metric" in [tags] {
    stdout {
      codec => line {
        format => "Rate: %{[events][rate_1m]}"
      }
    }
  } else {
    null {
      workers => 2
    }
  }
}
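For completeness, the tcp and file runs use the exact same pipeline with only the input block swapped out, roughly like this (the port number and the sincedb/start_position settings are just my local test choices, nothing special):

# tcp variant: I cat the log file into nc pointed at this port
input {
  tcp {
    port => 3333
  }
}

# file variant: read the same logs straight from disk
input {
  file {
    path => "/var/tmp/issue/dataz/*"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}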
Here is the filebeat config for reference:
filebeat:
  prospectors:
    -
      paths:
        - /var/tmp/issue/dataz/*
      input_type: log
output:
  logstash:
    hosts: ["localhost:5044"]
shipper:
logging:
  files:
    rotateeverybytes: 10485760 # = 10MB
Here are my results:
TCP (I just cat the file into nc): 39k/sec
File: 20k/sec
Beats: 3k/sec
The beats input thread barely uses any CPU (~20%). I can increase throughput by adding more workers and enabling load balancing in filebeat, but that only gets me to ~8k/sec. I also tried tweaking spool_size and harvester_buffer_size, with not much change.
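For the load-balanced run, the relevant parts of the filebeat config looked roughly like this (the second host assumes a second beats listener on 5045, and the spool/buffer numbers are just values I experimented with, not recommendations):

filebeat:
  spool_size: 4096                  # experimented with a few sizes here
  prospectors:
    -
      paths:
        - /var/tmp/issue/dataz/*
      input_type: log
      harvester_buffer_size: 65536  # also tried bumping this
output:
  logstash:
    hosts: ["localhost:5044", "localhost:5045"]
    loadbalance: true               # spread events across both listeners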
Am I missing something obvious? If possible, I would like to get ~10k EPS out of my current design, and I would really prefer to use the Lumberjack protocol instead of plain TCP. I haven't tested logstash-forwarder yet.
Thanks in advance!