We're in the process of optimizing our Filebeat deployment and are looking for general guidelines on how to do so.
We have multiple HAProxy servers that write logs via rsyslog. Each event is written in JSON and is 1.5-2.0 KB in size. Each log file can contain 6M-10M events.
We are sending these events to Kafka.
Currently, we run Filebeat with 4 workers and a bulk_max_size of 4098 on the Kafka output.
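For reference, here is roughly what that output section of our filebeat.yml looks like (broker hosts and topic name are placeholders; worker and bulk_max_size are the two settings we're currently tuning):

```yaml
output.kafka:
  # Placeholder broker list and topic name
  hosts: ["kafka1:9092", "kafka2:9092", "kafka3:9092"]
  topic: "haproxy-logs"
  # The two knobs we're currently tuning
  worker: 4
  bulk_max_size: 4098
```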
We're looking for ways to benchmark this setup, and/or best practices for shipping events to Kafka at this volume.
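One thing we've considered for establishing a broker-side baseline (independent of Filebeat) is the producer perf tool that ships with Kafka, with a record size matching our ~2 KB events. Topic and broker names below are placeholders for our environment:

```shell
# Raw producer throughput against the cluster, ~2 KB records,
# no throttling (--throughput -1), so we see the broker-side ceiling
bin/kafka-producer-perf-test.sh \
  --topic haproxy-logs \
  --num-records 1000000 \
  --record-size 2048 \
  --throughput -1 \
  --producer-props bootstrap.servers=kafka1:9092 acks=1
```

Comparing that ceiling to the event rate Filebeat actually achieves would tell us whether the bottleneck is on the Filebeat side or the Kafka side. Does that sound like a reasonable approach, or is there a better-established way to benchmark this?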