Overcome 2600 msg/seconds with Logstash/Filebeat

carrot · May 21, 2016, 3:24pm

I'm currently testing performance of file ingest but I'm not getting much right now. About 2500-2600 msg/secs. filebeat (1.2.3) -> logstash (2.3.2) -> elastic (2.3.3) with default configurations.

I've got a directory where I extract my logs, they get picked up by filebeat which a fairly generic configuration. These files are about 15GB in size, 45 million lines and each events are about 500 bytes.

The server with filebeat is using about 10% CPU (VM with only 1 CPU, just noticed).

Where could be the bottleneck here?

filebeat:
publish_async: true
prospectors:
-
paths:
- "/Volumes/Disk1/spool/*.log"
input_type: custom_ats_2
type: custom_ats_2
document_type: custom_ats_2

  tail_files: false
  close_older: 1h
  tags:
    sourcetype: custom_ats_2
  fields_under_root: true

output:
logstash:
hosts:
- logstash:5044

The container running logstash is fairly powerful... 12 cores, 192GB ram. Process usage about 160% for logstash.

input {
beats {
port => 5044
}
}

filter {
grok {
match => [ "message", "%{NUMBER:timestamp}" ]
}
kv {
source => "message"
include_keys => ["chi", "phn", "shn", "url", "cqhm", "cqhv", "pssc", "ttms", "b", "sssc", "sscl", "cfsc", "pfsc", "crc", "phr", "uas", "range"]
}
date {
match => [ "timestamp", "UNIX" ]
}

metrics {
meter => "events"
add_tag => "metric"
}

mutate
{
remove_field => [ "message", "source", "beat", "beat.hostname","input_type", "count", "offset" ]
remove_tag => ["beat"]
convert => {
"ttms" => "integer"
"b" => "integer"
"sscl" => "integer"
}
}
}

output {
if "metric" in [tags] {
file {
path => "/var/log/logstash/metrics.log"
codec => line {
format => "%{[time]} rate: %{[events][rate_1m]}"
}
}
} else {
elasticsearch {
hosts => ["hpc01","hpc02","hpc03","hpc04"]
}
}
}

Here's the metrics I'm gathering (which my time field doesn't expand, need to fix).
[root@cdn1cdlogp0001 ~]# tail -f /var/log/logstash/metrics.log
%{[time]} rate: 2549.4839094518907
%{[time]} rate: 2553.3790329886356
%{[time]} rate: 2554.6280165500852
%{[time]} rate: 2558.8154491438468
%{[time]} rate: 2561.660632737703
%{[time]} rate: 2564.4222480654926
%{[time]} rate: 2553.2906517247943

When I removed Elastic Search from logstash output, I got better numbers, but logstash process usage rose to 500%.
%{[time]} rate: 4299.910569505572
%{[time]} rate: 4364.3139484306475
%{[time]} rate: 4401.899953858465
%{[time]} rate: 4465.296741188148
%{[time]} rate: 4506.16230143131
%{[time]} rate: 4525.066816024065
%{[time]} rate: 4572.954869346784
%{[time]} rate: 4596.049650849632
%{[time]} rate: 4627.30831486682
%{[time]} rate: 4627.171725551646

A few things made it a little better end to end up to about 5000 events/sec... Of course CPU usage more than doubled everywhere...

Logstash LS_OPTS="-w 12 -b 2000"
Elastic Search output workers => 4

anhlqn · May 21, 2016, 11:55pm

I would say filebeat is the bottleneck. I was able to push 50,000 EPS through Logstash with minimal filter in my performance test. One way to test is to remove all Logstash filter and measure the throughput.

carrot · May 22, 2016, 3:52am

Removing the filter didn't change the numbers with filebeat in the picture...

Although, once I removed FileBeat and used Logstash file input I could get more performance out of it.

I think the biggest bottleneck is the default (1) workers for Elastic Search Logstash output. I'll need to retest with filebeat.

File Beat -> Logstash (kv) -> Elastic Search : 5,000 EPS
Logstash (file input, kv, no output) : 15,600 EPS
Logstash (file input, no kv, no output) : 20,000 EPS
Logstash (file input, kv, es 4 workers)-> Elastic Search : 7,800 EPS
Logstash (file input, kv, es 8 workers)-> Elastic Search : 12,000 EPS

It's getting better... Looks like I'll need to do more testing introducing redis/kafka and having separate logstash for ingest and indexing.

Topic		Replies	Views
Filebeat sending data to Logstash seems too slow Beats filebeat	20	22464	June 1, 2017
Speed limitations of filebeat? Beats filebeat	14	15266	July 5, 2017
Filebeat unable to cope with incoming logs Beats filebeat	7	1560	February 8, 2018
Performance issues with input-beats plugin Logstash	6	5716	July 6, 2017
FileBeat slow - Improve performance Beats filebeat	12	20205	September 20, 2018

Overcome 2600 msg/seconds with Logstash/Filebeat

Related topics