We are using the logstash-kafka input with this configuration:
kafka {
  # Kafka consumer group ("logstash" is the default; specified for clarity)
  group_id => "logstash"
  white_list => "topic1,topic2,topic3,...,topicN"
  zk_connect => "connection-info"
  codec => "line"
}
I know that it is important to maintain a balance between consumer threads and partitions. How does that affect our ability to scale by adding Logstash instances? For example, if we are consuming from one topic with a single partition, we can't actually scale up simply by adding a Logstash instance, because having more consumer threads than partitions leaves the extra threads idle.
Likewise, if we add more topics to our whitelist, we can end up with more partitions than consumers. Can logstash-kafka be configured so that it scales easily without losing the consumer-partition balance?
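
To make the arithmetic behind this concern concrete, here is a rough sketch in Python. It is not part of our setup and does not model Kafka's actual rebalancing, which assigns partitions per topic. It assumes, as a simplification, that all partitions matched by the whitelist are spread as evenly as possible across every consumer thread in the group. The function name `thread_balance` and its parameters are hypothetical, chosen only for illustration.

```python
def thread_balance(total_partitions, instances, threads_per_instance):
    """Return (total_threads, idle_threads, max_partitions_per_thread).

    Simplifying assumption: partitions are spread as evenly as possible
    over all consumer threads in the group. Real Kafka assignment works
    per topic, so actual numbers may differ.
    """
    total_threads = instances * threads_per_instance
    # A partition is read by at most one thread in a consumer group,
    # so at most `total_partitions` threads can be doing work.
    busy_threads = min(total_threads, total_partitions)
    idle_threads = total_threads - busy_threads
    # Ceiling division: the most-loaded thread's share of partitions.
    max_per_thread = -(-total_partitions // total_threads) if total_threads else 0
    return total_threads, idle_threads, max_per_thread


# One topic with one partition, two Logstash instances with one thread each:
# the second instance adds a thread that has nothing to read.
print(thread_balance(total_partitions=1, instances=2, threads_per_instance=1))

# Growing the whitelist to 12 partitions with the same two instances:
# no thread is idle, but each one now has several partitions to handle.
print(thread_balance(total_partitions=12, instances=2, threads_per_instance=1))
```

Under this simplified model, the first case leaves one thread idle and the second puts six partitions on each thread, which are the two imbalances described above.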