Which Logstash input plugin is the fastest?

Hello everybody.

I need a high event rate into Logstash - at least 50k events per second - and I must also have a way of scaling further.
How can I forward 50k, 100k, 150k events per second?
I have a central rsyslog server that stores the logs of all my devices, and I need to forward those logs to Logstash.
I tried the syslog input plugin and got a 3.5k event rate,
the unix input plugin - 11k rate,
the tcp plugin - about 50k rate.

For testing I use a simple Logstash config, like this:

    input {
        # the input plugin under test (syslog, tcp, unix, ...) goes here
    }
    filter {
        metrics {
            meter => "in_events"
            add_tag => "metrics"
        }
    }
    output {
        if "metrics" in [tags] {
            file {
                path => "/var/log/logstash/logmeter"
                codec => line { format => "in_rate_1m: %{[in_events][rate_1m]}   out_rate_1m: %{[out_events][rate_1m]}" }
            }
        }
    }

Tell me, how many events per second do you get in your production setups?

Performance will depend on the amount of processing you do on the events as well as the throughput supported by downstream systems, not just the throughput of the input plugin. What kind of processing will you be doing on your data? Where will you be sending it?

I know that performance depends on the Logstash filters and outputs; that is why I removed all my filters and set the output to file. But if I see <50k events per second with this test config, then once I enable my Logstash filters (grok, aggregate, etc.) I will get an even lower event rate. I think that for high performance I must use logstash (w/o filters) -> redis -> logstash (with filters) -> elasticsearch.

OK, so the Logstash config you are referring to is basically a collector that does minimal processing and enqueues data in Redis. This is generally a good architecture, as you can have multiple Logstash indexers reading from it, which allows you to scale out horizontally. It will also provide buffering if the indexing layer is not able to keep up, which reduces the risk of losing data at the source.
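
As a rough sketch (the key name, the y.y.y.y Elasticsearch host, and the port are placeholders, not your actual values), the two stages could look something like this:

    # Collector: minimal processing, enqueue events into a Redis list
    input {
        tcp {
            port => 8000
            type => "log"
        }
    }
    output {
        redis {
            host => "x.x.x.x"
            data_type => "list"
            key => "logstash_queue"
        }
    }

    # Indexer: read from the same Redis list, apply filters, ship to Elasticsearch
    input {
        redis {
            host => "x.x.x.x"
            data_type => "list"
            key => "logstash_queue"
        }
    }
    filter {
        # grok, aggregate, etc. go here
    }
    output {
        elasticsearch {
            hosts => ["y.y.y.y:9200"]
        }
    }

This keeps the heavy filtering off the collection path, and every additional indexer simply points at the same list key.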

You will however probably also need to be able to scale out the collection layer horizontally. Most input plugins can be tuned quite a bit, so it would be useful to see the configurations you have used to reach the results mentioned.

I use the following options:

    syslog {
        port => 8000
        type => "log"
    }

    tcp {
        type => "log"
        port => 8000
    }

    unix {
        path => "/tmp/socket"
    }

My test server has the following configuration:

- CentOS 6.8
- 8 GB RAM
- 8 × Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz

Which version of Logstash are you using? How many concurrent connections are you using when sending data to Logstash?

I use the latest stable version, logstash-2.4.0.
In production I must use one connection, because I have one central log server. But in my test I tried to forward logs over several connections to the unix socket, and I still got only an 11k rate.

I tried using filebeat. Filebeat forwards the logs to Logstash (8 workers). I created 4 log files: /tmp/file1.log /tmp/file2.log /tmp/file3.log /tmp/file4.log
filebeat config:
    filebeat:
      prospectors:
        -
          paths:
            - /tmp/*.log
          input_type: log
      registry_file: /var/lib/filebeat/registry
    output:
      logstash:
        hosts: ["x.x.x.x:8000"]
        worker: 8
        bulk_max_size: 10000

Other values in the filebeat config are at their defaults.

And I get the following result: in_rate_1m: 3408
I think this is very bad performance.

If the events are of a similar size to those in the previous examples, I would agree that 3408 events per second is not very good. What does your beats input config look like?
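
For reference, a minimal beats input on the Logstash side is usually as simple as this (the port is assumed here to match your filebeat output):

    input {
        beats {
            port => 8000
        }
    }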

My filebeat input config:

    filebeat:
      prospectors:
        -
          paths:
            - /tmp/*.log
          input_type: log
          scan_frequency: 0s
          harvester_buffer_size: 32768
      spool_size: 81920

Also, I tried the following scheme:
syslog-ng forwards the logs to Redis and Logstash reads them from Redis. The result is a large flow of logs from syslog-ng into Redis - the Redis list grows very fast - but Logstash can only consume about 9k events per second.
Logstash redis input config:

    redis {
        host => "x.x.x.x"
        data_type => "list"
        type => "log"
        key => "pp_rtest"
        threads => 8
    }

I have not tuned the redis input in some time, but I recall the batch size parameter having an impact on performance. Try gradually increasing it to see how that impacts throughput.
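
If I remember correctly, on the 2.x redis input that option is batch_count; a sketch reusing your config, with an arbitrary starting value:

    redis {
        host => "x.x.x.x"
        data_type => "list"
        type => "log"
        key => "pp_rtest"
        threads => 8
        batch_count => 1000   # raise gradually and measure; 1000 is only a starting point
    }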

Perhaps batch size does affect performance, but I doubt I can get a 50k event rate per second.
I think there should be a more general solution. Do you agree?

Increasing batch size should give you a significant boost in throughput. To get to the optimal performance for your use case you may need to be methodical and benchmark a few different combinations of worker threads (Logstash filter workers as well as input and output workers). I do not know where the limit is.
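
For example, with Logstash 2.x you can vary the pipeline workers and batch size on the command line while benchmarking (the numbers here are arbitrary starting points, not recommendations):

    bin/logstash -f test.conf -w 8 -b 1000

Worker or thread counts on individual plugins (such as threads on the redis input or workers on the elasticsearch output) can then be varied independently.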

At some point you will need to scale out though, and Redis (or another message queue) will allow you to have a number of Logstash instances reading off the same queue. The limit in most ingest pipelines I see is, however, usually the actual processing of the events or the throughput of the outputs.