I am developing a system with Apache Kafka, Logstash, and Elasticsearch. Data is produced into Kafka, and Logstash indexes it into Elasticsearch.
On the Kafka side, throughput is good: I produce 30K messages per second.
This is from the Kafka monitoring:
But on the other side, when I consume the data with Logstash, the rate is much lower than on the producer side.
This is the Logstash events rate:
This is for a Kafka topic with 2 partitions (in the Logstash config, I set consumer_threads => 2).
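
For reference, a minimal sketch of the kind of pipeline described above. Only consumer_threads => 2 comes from the actual setup; the broker address, topic name, consumer group, Elasticsearch host, and index name are placeholders assumed for illustration.

```
input {
  kafka {
    bootstrap_servers => "localhost:9092"   # placeholder broker address
    topics            => ["my-topic"]       # placeholder topic name
    group_id          => "logstash"         # placeholder consumer group
    consumer_threads  => 2                  # one consumer thread per partition
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]      # placeholder Elasticsearch address
    index => "my-index"                     # placeholder index name
  }
}
```

Each partition is read by at most one consumer thread within a consumer group, so setting consumer_threads higher than the partition count would only leave the extra threads idle.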
When I create a new topic with 3 partitions and 3 consumer threads, the Logstash rate increases linearly:
And finally, for 4 partitions:
It seems that some configuration in Logstash is limiting the events rate.
Logstash is limited by the systems it sends data to, so it could also be that Elasticsearch or some other output is limiting performance. What does CPU usage look like on the Elasticsearch and Logstash nodes? What does disk I/O look like on the Elasticsearch nodes? What is the specification of your Elasticsearch cluster? How many indices/shards are you actively indexing into?