Preventing dropped events across multiple Logstash instances when consuming from Kafka

Hi everyone, I am new to Elastic, apologies if this is something obvious.

I have multiple Logstash instances consuming input from Amazon MSK (Kafka), in order to lower the load on each Logstash instance.

In each Logstash instance, I use a filter to limit the number of Beats files that get sent to Elasticsearch. For example, if I want 10 Beats files in Elasticsearch and I am running 2 Logstash instances, I limit each Logstash instance to process only 5 Beats files (a rough sketch of what I mean is below).
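For illustration, here is a minimal sketch of one way such a cap could be expressed, assuming "limit the number of Beats files" means dropping events past a fixed count. The threshold of 5 and the ruby-filter approach are illustrative, not my actual config:

```
filter {
  ruby {
    # Hypothetical per-instance cap: count events and drop everything
    # after the first 5. Purely a sketch of the idea described above.
    init => "@seen = 0"
    code => "
      @seen += 1
      event.cancel if @seen > 5
    "
  }
}
```

Note that with multiple pipeline workers, each worker keeps its own counter, so a cap like this applies per worker rather than per instance.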

My question is: how do I make sure Beats files are only processed by a Logstash instance that is not already saturated? Continuing the example above, if I have 9 Beats files and they all happen to be routed to the same instance, will the 4 files beyond that instance's limit of 5 be lost?

I think I must have misunderstood the Logstash mechanism somehow... can someone help, please? Thank you!

Hi,

If you configure the Kafka input on all of your Logstash instances with the same group_id, they join a single Kafka consumer group, and Kafka divides the topic's partitions among them. Note that the balancing is per partition, not per message: each instance receives whatever lands in its assigned partitions, so an even spread of events depends on the topic having enough partitions (at least as many as your total consumer threads) and on how the producers key their messages. Kafka does not route individual messages away from a consumer that is busy, so rather than capping each instance with a filter, it is safer to let the consumer group share the full load.
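As an illustration, a minimal kafka input along these lines might look like the following; the broker addresses, topic name, and group name are placeholders, not values from your setup:

```
input {
  kafka {
    bootstrap_servers => "b-1.msk.example:9092,b-2.msk.example:9092"
    topics            => ["beats-events"]  # placeholder topic name
    group_id          => "logstash"        # identical on every instance
    consumer_threads  => 2                 # total threads across all instances
                                           # should not exceed the partition count
    codec             => "json"
  }
}
```

With this in place, if one instance falls behind or goes down, Kafka rebalances its partitions to the remaining members of the group, so events stay in the topic until some instance consumes them rather than being lost.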

Regards