Logstash Kafka consumer count

According to the Logstash guide:

"How many partitions should I use per topic?"

At least the number of Logstash nodes multiplied by consumer threads per node.

Better yet, use a multiple of the above number. Increasing the number of partitions for an existing topic is extremely complicated. Partitions have a very low overhead. Using 5 to 10 times the number of partitions suggested by the first point is generally fine, so long as the overall partition count does not exceed 2000.

(Emphasis mine.)
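To make the quoted arithmetic concrete, here is a minimal sketch of how I read it. The function name and parameters are hypothetical, not part of Logstash or Kafka, and applying the 2000 cap to a single topic is my simplification, since the guide says "overall partition count".

```python
def suggested_partitions(nodes, threads_per_node, multiple=5, cap=2000):
    """Return (minimum, suggested) partition counts per the quoted guidance.

    Assumption: the 2000 limit is applied to this one topic; the guide
    speaks of the overall partition count, which may span several topics.
    """
    minimum = nodes * threads_per_node          # "at least" figure from the guide
    suggested = min(minimum * multiple, cap)    # 5x to 10x the minimum, capped
    return minimum, suggested


# Example: 3 Logstash nodes with 4 consumer threads each
print(suggested_partitions(3, 4, multiple=5))
print(suggested_partitions(3, 4, multiple=10))
```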

Why is it better to have more partitions per consumer, rather than just 1-to-1?

Is there an efficiency benefit? A redundancy benefit?

And are there costs as well? For example, could the time spent iterating over consumers grow with the consumer count, could there be contention at some point, or could Kafka incur memory overhead from managing lots of consumers?

The guide makes no sense to me as-is. I suspect that original-brownbear meant "extremely cheap" rather than "extremely complicated". karenzone just copied the FAQ that original-brownbear wrote to summarize questions he was repeatedly being asked.

If your thread and partition counts are in the double digits, I would not worry about it. If they are in the thousands, benchmark it.

A larger number of partitions will result in a smoother distribution of partitions across threads.
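One way to see the smoothing effect is to count how many partitions each thread ends up with when the partition count does not divide evenly by the thread count, for example after one thread or node drops out. The sketch below is only an illustration: the helper names are hypothetical, and it assumes an even split where the first few threads take one extra partition, which may not match the assignor your consumers actually use.

```python
def partitions_per_thread(num_partitions, num_threads):
    """Split partitions as evenly as possible across consumer threads.

    Assumption: the first (num_partitions % num_threads) threads each
    receive one extra partition.
    """
    base, extra = divmod(num_partitions, num_threads)
    return [base + 1 if i < extra else base for i in range(num_threads)]


def imbalance(num_partitions, num_threads):
    """Ratio of the busiest thread's partition count to a perfectly even share."""
    counts = partitions_per_thread(num_partitions, num_threads)
    ideal = num_partitions / num_threads
    return max(counts) / ideal


# 4 threads sized 1-to-1 with 4 partitions, then one thread is lost:
print(partitions_per_thread(4, 3), imbalance(4, 3))

# Same 3 surviving threads, but the topic has 10x as many partitions:
print(partitions_per_thread(40, 3), imbalance(40, 3))
```

With 4 partitions on 3 threads, one thread carries twice as many partitions as the others. With 40 partitions the busiest thread holds 14 against 13 for the rest, so the load is much closer to even.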

