Impact of group_events on Filebeat and Kafka efficiency

Hello team,
I run a fairly large fleet of Filebeat instances, all of which ship events to a Kafka cluster with multiple brokers; the events are then consumed by Logstash. I am currently researching how to configure Filebeat to ship events to Kafka more efficiently.

I was looking at round_robin.group_events, since I use round_robin as my partitioning strategy, and I expected it to improve efficiency, but I haven't noticed any difference when trying different values for round_robin.group_events. For example, I set group_events as high as 5000 on several servers running Filebeat, and while shipping ~50,000 events per second from those servers to Kafka, performance was very similar to what I get with the default group_events value of 1. The metrics I monitored on the Filebeat servers and on the Kafka brokers were the number of batches sent to Kafka over a given period, resource utilization (CPU, RAM, disk I/O), and network behaviour (e.g. number of connections, packet size, etc.). All of them look much the same, as if the setting has no effect at all.
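For reference, here is roughly how the relevant part of my filebeat.yml looks (broker hostnames and the topic name are placeholders, not my real values):

```yaml
output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092", "kafka3:9092"]  # placeholder broker addresses
  topic: "filebeat-logs"                                 # placeholder topic
  partition.round_robin:
    reachable_only: false
    # Number of events published to the same partition before
    # the partitioner moves on to the next one; the default is 1.
    group_events: 5000
```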

In the absence of extensive documentation for round_robin.group_events, I was wondering whether anyone can explain how this setting affects performance and in which metrics I should expect to see an improvement.

Does anyone have experience with a setup similar to the one above? Any recommendations on using group_events?
