We have a logging infrastructure, built as follows:
- Filebeat agents collect logs and ship them to a single Kafka topic, at ~50k messages per second. This part of the architecture works great.
- The topic has 50 partitions, so we run 50 Logstash instances, each with a single consumer thread.
- Our Logstash configuration is pretty simple, without many grok filters. I will add the configuration (both logstash.conf and logstash.yml) as a comment on this thread.
- We initially used the default in-memory queue, which caused massive consumer lag on Kafka (millions of messages!). We switched to the disk-based (persistent) queue and now have no lag on Kafka at all, but we fear the lag has simply moved into the Logstash queue (a sketch of the relevant settings follows this list).
- We output everything to ES, splitting all the logs into 4 indices.
- Our ES cluster consists of 2 client nodes (8 cores, 32 GB RAM each) and 3 combined data/master nodes (8 cores, 32 GB RAM each).
- ES indexes documents at ~30k per second, roughly 10k per data node. At that rate the client nodes collapse because heap usage climbs past 85%.
- We also ran some tests and found that Elasticsearch is our bottleneck.
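
For completeness, the queue change mentioned above comes down to a few settings in logstash.yml. This is only a sketch with placeholder values; the real files will follow in the comment:

```yaml
# logstash.yml -- switch from the default in-memory queue to the persistent (disk-based) queue
queue.type: persisted

# Cap on disk usage per pipeline; once the queue is full, Logstash applies
# back-pressure to the Kafka input instead of growing further (placeholder value)
queue.max_bytes: 8gb

# Directory where the queue pages are written (placeholder path; should be a fast local disk)
path.queue: /var/lib/logstash/queue
```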
Now, I have a few questions I would like to ask:
- How can I monitor the disk-based queue in Logstash? I need to be sure we are not accumulating lag inside Logstash, which could fill the disk (and probably trigger various other failures). See the monitoring sketch after these questions.
- Why is the output to ES so slow, and how can I make it faster? We tried changing the number of output workers, but it didn't improve anything. The index-level settings we are considering are sketched below as well.
- For handling approximately 50k+ messages per second, what ES cluster size would you suggest?
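
Regarding monitoring the persistent queue, the only thing we have found so far is the Logstash node stats API (assuming the default API binding on port 9600); something along these lines should show how much data is sitting in each pipeline's queue:

```sh
# Per-pipeline stats; the "queue" section reports the queued event count and
# the queue size on disk (exact field names vary slightly between Logstash versions)
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'
```

Beyond that, we would simply watch disk usage under path.queue and alert before it approaches queue.max_bytes.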
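On the ES side, the index-level knobs we are considering (only a sketch; the values are placeholders and the "logs-*" index pattern is hypothetical) are relaxing the refresh interval and dropping replicas on the hot write indices during heavy indexing:

```sh
# Dynamic index settings applied via the index settings API
curl -s -XPUT 'http://localhost:9200/logs-*/_settings' \
  -H 'Content-Type: application/json' \
  -d '{
    "index": {
      "refresh_interval": "30s",
      "number_of_replicas": 0
    }
  }'
```

On the Logstash side, pipeline.batch.size in logstash.yml controls how many events go into each bulk request towards ES, which presumably matters here as well.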