Tuning Logstash for optimal throughput in a Kafka->ELK pipeline

We have a Kafka->ELK pipeline, and Logstash is one of the key components between Kafka and Elasticsearch. It looks like Logstash cannot process data at the same rate it is being ingested into Kafka.
Our Logstash config is below:

input {
  kafka {
    bootstrap_servers => "localhost:9092"
    consumer_threads => 24
    topics => ["test0"]
  }
}

filter {
  json {
    source => "message"
    target => "log"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}

The localhost machine has 112 CPU cores in total.

Do we need to make any specific input or output configuration changes to improve Kafka->ELK pipeline throughput?
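If Logstash itself turns out to be the slow stage, a few input settings are worth experimenting with. A sketch, assuming the topic has at least 24 partitions (consumer threads beyond the partition count sit idle); the fetch/poll values are illustrative starting points, not tested recommendations:

```
input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics            => ["test0"]
    # consumer_threads should not exceed the number of partitions on the
    # topic; extra threads receive no assignments and do nothing.
    consumer_threads  => 24
    # Pull more data per poll so each pipeline batch arrives fuller
    # (illustrative values, tune against your message sizes).
    max_poll_records  => "2000"
    fetch_max_bytes   => "52428800"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```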

Logstash can only process data at the speed at which Elasticsearch is able to consume it. How have you determined that Elasticsearch is not the bottleneck?
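One way to answer that question from the Logstash side is the monitoring API (`GET localhost:9600/_node/stats/pipelines`), which reports per-plugin `duration_in_millis`. A minimal sketch that parses an illustrative stats payload (the event counts and durations are made up) and shows where the wall-clock time goes:

```python
import json

# Illustrative sample of the per-plugin stats returned by Logstash's
# monitoring API (GET localhost:9600/_node/stats/pipelines).
# The numbers below are invented for this sketch.
stats = json.loads("""
{
  "plugins": {
    "filters": [
      {"name": "json",
       "events": {"in": 1000000, "duration_in_millis": 60000}}
    ],
    "outputs": [
      {"name": "elasticsearch",
       "events": {"in": 1000000, "duration_in_millis": 540000}}
    ]
  }
}
""")

# Effective events/sec spent inside each plugin: if the elasticsearch
# output dominates wall-clock time, Elasticsearch (or the HTTP round
# trip to it) is the likely bottleneck, not the filter stage.
rates = {}
for stage in ("filters", "outputs"):
    for plugin in stats["plugins"][stage]:
        ev = plugin["events"]
        rates[plugin["name"]] = ev["in"] / (ev["duration_in_millis"] / 1000.0)
        print(f'{plugin["name"]}: {rates[plugin["name"]]:,.0f} events/sec handled')
```

If the output plugin's effective rate is far below the filters' rate, look next at Elasticsearch bulk-indexing performance rather than Logstash worker counts.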

Thanks for the quick reply, Chris.
I am new to the ELK pipeline, so please expect some basic queries (although I have done some research).

You are right, ES could also be the bottleneck, but CPU utilization for ES is ~10% (1 cluster, 5 shards, ES 7.5.1, pipeline.workers=112, pipeline.batch.size=125, consumer_threads=48).
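For reference, the pipeline-level knobs mentioned above live in `logstash.yml`, not in the pipeline config file. A sketch with hedged values: 125 is the default batch size, and raising it (together with heap headroom) is usually the first lever for bulk-indexing throughput, while a high worker count only helps if the filter stage is actually CPU-bound:

```
# logstash.yml -- illustrative starting points, not a golden config
pipeline.workers: 112      # defaults to the number of CPU cores
pipeline.batch.size: 1000  # default 125; larger batches mean larger _bulk requests to ES
pipeline.batch.delay: 50   # ms to wait for a batch to fill (default 50)
```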

The producer script with 112 threads and 100,000,000 records works fine on the producer side, but the consumer server running the Kafka-ELK components shows very low CPU utilization: ~20% for Logstash and ~10% for ES.

So my main concern is to find a golden config for Logstash and ES that gives optimal throughput for the Kafka->ELK pipeline.

Hi Chris,

Did I confuse you with my question? Do you need any other details?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.