Performance issues with input-beats plugin

I'm running into strange performance issues with the input-beats plugin.

Logstash: 2.1.1 (reproduced on trunk yesterday, FWIW)
input-beats: 2.0.3
filebeat: 1.0.1

It seems like throughput is being bottlenecked by something other than CPU or I/O.

I'm doing my tests with these Apache access logs from NASA.

My configuration is very simple; I just swap the input between tcp / file / beats for the tests:

input {
    beats {
        port => 5044
    }
}

filter {
    metrics {
        meter => "events"
        add_tag => "metric"
    }
}

output {
    if "metric" in [tags] {
        stdout {
            codec => line {
                format => "Rate: %{[events][rate_1m]}"
            }
        }
    } else {
        null {
            workers => 2
        }
    }
}

Here is the filebeat config for reference:

filebeat:
  prospectors:
    -
      paths:
        - /var/tmp/issue/dataz/*
      input_type: log

output:
  logstash:
    hosts: ["localhost:5044"]

logging:
  files:
    rotateeverybytes: 10485760 # = 10MB

Here are my results:

TCP (I just cat the file into nc): 39k/sec
File: 20k/sec
Beats: 3k/sec

The beats input thread barely uses any CPU (~20%). I can increase throughput by adding more workers and enabling load balancing in filebeat, but that only gets me to ~8k/sec. I also tried tuning spool_size and harvester_buffer_size, without much change.
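For anyone retracing these experiments: the options mentioned above live at different levels of the filebeat 1.x config. This is a sketch assuming the 1.x layout; the values are placeholders for experimentation, not recommendations:

```yaml
filebeat:
  spool_size: 4096                   # events buffered before flushing to the output
  prospectors:
    -
      paths:
        - /var/tmp/issue/dataz/*
      harvester_buffer_size: 16384   # per-file read buffer, in bytes

output:
  logstash:
    hosts: ["localhost:5044"]
    worker: 2                        # connections per configured host
    loadbalance: true                # spread batches across workers/hosts
```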

Am I missing something obvious? If possible, I would like to get ~10k EPS for my current design. I would really like to use the lumberjack protocol instead of TCP. I haven't tested logstash-forwarder yet.

Thanks in advance!

Can you try increasing the bulk_max_size option in the filebeat logstash output config?

    bulk_max_size: ...

What's the CPU usage on the Logstash side? In the tests I'm doing on my laptop (with 2 cores) I also get around 8K/s, but LS seems to push the CPUs to their limits. I'll try tomorrow on a more powerful machine and report back on the results I get.

I tested on a server with 8 CPU threads, and I could get around 16 K/s by setting bulk_max_size=3000, and all settings set to default. I used the Logstash config that @cooper6581 posted above. I tried playing with other options (number of workers, spooler size, etc.) but nothing seemed to have an impact except for bulk_max_size. Filebeat CPU usage was around 40% and LS was around 150% of a CPU core.
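For concreteness, bulk_max_size sits under the logstash output section of the filebeat config; a minimal sketch assuming the filebeat 1.x layout, with the value from the test above:

```yaml
output:
  logstash:
    hosts: ["localhost:5044"]
    bulk_max_size: 3000   # max events per batch sent to Logstash
```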

Based on the above experiments, I would guess that the limitation is in the input-beats in Logstash. Compared to the TCP input, for example, the input-beats has to do decompression, JSON decoding (the beats -> logstash protocol is json based) and some light data manipulation.

We'll continue to investigate this, perhaps there are some easy wins. We'll keep you up to date.


This worked perfectly, thanks @tudor and @steffens!


I'm trying to send 20 kB/s from filebeat to Logstash, and filebeat cannot keep pace:

This is my filebeat config :slight_smile:

filebeat:
  spool_size: 8192
  prospectors:
    -
      paths:
        - D:\zzzzz*.log
      document_type: mytype

output:
  logstash:
    enabled: true
    hosts: ["logstash:5043"]
    bulk_max_size: 8192