I am developing a system with Apache Kafka, Logstash, and Elasticsearch. Data is produced into Kafka, and Logstash indexes it into Elasticsearch.
On the Kafka side I get good throughput and produce 30K messages per second.
This is from Kafka monitoring:
But on the other side, when I consume the data with Logstash, the rate is much lower than on the producer side.
This is the Logstash events rate:
This is for a Kafka topic with 2 partitions (in the Logstash config I set consumer_threads => 2).
When I create a new topic with 3 partitions and 3 consumer threads, the Logstash rate increases linearly:
And finally, for 4 partitions:
It seems some configuration in Logstash is limiting the events rate.
Logstash is limited by the systems it sends data to, so it could also be that Elasticsearch or some other output is limiting performance. What does CPU usage look like on the Elasticsearch and Logstash nodes? What does disk I/O look like on the Elasticsearch nodes? What is the specification of your Elasticsearch cluster? How many indices/shards are you actively indexing into?
In the test above, I have no output configured for Logstash.
This is my config file:
input {
  kafka {
    bootstrap_servers => "172.21.21.62:9092,172.21.21.62:9093"
    group_id => "logstash"
    auto_offset_reset => "latest"
    topics => ["elk_1"]
    consumer_threads => 4
    codec => "json"
  }
}
So Elasticsearch does not affect the rates.
CPU usage is below 20%.
I developed a custom Kafka consumer, and its rate equals the Kafka producer's rate, so Kafka is not the limiting factor.
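Because the rate grows with consumer_threads while CPU usage stays low, one thing that may be worth testing is whether per-thread work inside the kafka input is the limit, for example JSON decoding, which the codec performs in each consumer thread. The following is only a sketch under that assumption, not a confirmed fix. It switches the codec to plain, moves JSON parsing into a json filter so it runs in the pipeline workers, and raises max_poll_records. The max_poll_records value is an illustrative guess, and older versions of the kafka input plugin may expect it as a string rather than a number.

```
input {
  kafka {
    bootstrap_servers => "172.21.21.62:9092,172.21.21.62:9093"
    group_id => "logstash"
    auto_offset_reset => "latest"
    topics => ["elk_1"]
    # Threads beyond the partition count stay idle, so match the partition count.
    consumer_threads => 4
    # Assumption: skip JSON decoding in the input threads to see if it is the bottleneck.
    codec => "plain"
    # Illustrative value only, not a recommendation.
    max_poll_records => 2000
  }
}
filter {
  # Parse the JSON payload in the pipeline workers instead of the input threads.
  json {
    source => "message"
  }
}
```

If the events rate changes noticeably with this variant, the pipeline.workers and pipeline.batch.size settings in logstash.yml would be the next things to look at, since they control how the filter stage is parallelized.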
This is the complete set of Logstash monitoring metrics: