The docs say that the number of consumer threads should equal the number of partitions, or that number divided by the number of Logstash instances if more than one is running.
I wanted to understand how to calculate the value of consumer_threads when topics_pattern is used. Say each topic has 4 partitions and I expect 100 topics to match the topics_pattern; should I set consumer_threads to 400?
Matching the number of threads and partitions matters for small numbers of partitions. If you have 3 partitions and 4 threads then one thread will be idle. If you have 3 partitions and 2 threads then one thread will service 2 partitions and do twice as much work as the other.
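To make the arithmetic concrete, here is a rough sketch in Python. It assumes partitions are simply dealt out to threads one at a time; the real assignment is made by Kafka according to the configured assignment strategy, so the exact split may differ. The function name `spread_partitions` is made up for illustration.

```python
def spread_partitions(num_partitions, num_threads):
    """Return how many partitions each thread ends up with.

    Simplified model (an assumption): partitions are handed out
    round-robin, one at a time, across the available threads.
    """
    load = [0] * num_threads
    for partition in range(num_partitions):
        load[partition % num_threads] += 1
    return load


print(spread_partitions(3, 4))    # [1, 1, 1, 0] -> one thread is idle
print(spread_partitions(3, 2))    # [2, 1]       -> one thread does twice the work
print(spread_partitions(100, 1))  # [100]        -> a single thread services everything
```

With hundreds of partitions the imbalance between threads becomes small relative to the total, which is why exact matching matters mostly for small partition counts.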
I cannot see much point in having the number of threads much larger than the number of CPUs that will run them.
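As a rule of thumb only, the guideline from the docs and the CPU cap can be combined as below. This is a sketch, not an official formula; `suggested_consumer_threads` and its parameters are hypothetical names, and the 8-CPU figure in the example is an assumption.

```python
import math


def suggested_consumer_threads(topics, partitions_per_topic,
                               logstash_instances, cpus_per_instance):
    """Heuristic: partitions per instance, capped at the CPU count."""
    total_partitions = topics * partitions_per_topic
    # Docs guideline: split the partitions across the running instances.
    per_instance = math.ceil(total_partitions / logstash_instances)
    # Opinion from this thread: more threads than CPUs gains little.
    return min(per_instance, cpus_per_instance)


# 100 topics x 4 partitions, one Logstash instance on an assumed 8-CPU host
print(suggested_consumer_threads(100, 4, 1, 8))  # 8, not 400
# A small topic: 3 partitions fit comfortably under the CPU cap
print(suggested_consumer_threads(1, 3, 1, 8))    # 3
```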
Thanks for the reply!
I understand the idea of consumer threads and workers. My question is: if a topics_pattern covers 100 topics and I set consumer_threads => 1, will Logstash automatically spawn one consumer thread per topic, or do I need to set consumer_threads => 100 for 100 topics?
(Assuming every topic has a single partition.)
The number of consumer threads will be equal to consumer_threads. They get spun up when the input is started, and only later do they subscribe to the topics_pattern.
Of course I could be misunderstanding the code, but that is how I interpret it.