Filebeat 5.0beta1 with kafka output: multiplication of log lines

Digging into the code, this can indeed be a problem. The docs say the message is dropped, but the code returns a sarama.ErrMessageSizeTooLarge error. While sarama itself would only drop this one message, libbeat aggregates all errors for a batch, potentially forcing the kafka output to resend the complete batch — which would explain the multiplication of log lines.
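For illustration, here is a minimal Go sketch of the idea (not the actual libbeat code; the package and function names are hypothetical): split the per-batch errors sarama reports, drop only the messages that can never succeed, and retry the rest instead of the whole batch.

```go
package kafkaout

import (
	"log"

	"github.com/Shopify/sarama"
)

// filterRetryable splits the errors sarama reports for a batch: messages
// that can never succeed (too large for the broker) are logged and dropped,
// everything else is returned so it can be retried.
func filterRetryable(errs []*sarama.ProducerError) []*sarama.ProducerMessage {
	var retry []*sarama.ProducerMessage
	for _, pe := range errs {
		if pe.Err == sarama.ErrMessageSizeTooLarge {
			// Retrying is pointless here: the broker will reject this
			// message every time, so drop it with an error message
			// instead of forcing a resend of the complete batch.
			log.Printf("kafka output: dropping too-large event: %v", pe.Err)
			continue
		}
		// Transient failures (broker down, timeouts, ...) stay in the
		// batch and are handed back for a retry.
		retry = append(retry, pe.Msg)
	}
	return retry
}
```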

@psychonaut maybe you can create a GitHub issue about batches being resent if some event in the batch is too big? I'd really like to mark the issue for the Pioneer Program.

I will prepare a fix on master for just dropping too-large messages with an error message (the output cannot tell which fields would need to be shortened). The real fix is to align the size limits in kafka itself with the kafka output section in filebeat.yml and/or the multiline settings (see the example config below). This is unfortunately fully up to the user: as these big events are normally traces, no one wants to drop them, but kafka settings can force producers to do so.
I've been thinking about adding a dead-letter queue for a while (e.g. for this use-case), but it definitely won't make the 5.0 release.
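For reference, the knobs involved look roughly like this (paths and sizes are placeholder examples, not recommendations): on the Kafka side the broker's message.max.bytes (or the per-topic max.message.bytes), and in filebeat.yml the kafka output's max_message_bytes plus the prospector's max_bytes and multiline.max_lines limits.

```yaml
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/app/*.log        # example path
    # Cap the size of a single (possibly multiline) event on the filebeat side.
    max_bytes: 1000000
    multiline:
      pattern: '^\s'
      match: after
      # Limit how many lines a multiline event may aggregate.
      max_lines: 500

output.kafka:
  hosts: ["kafka:9092"]
  topic: logs
  # Must not exceed the broker/topic limit (message.max.bytes /
  # max.message.bytes on the Kafka side).
  max_message_bytes: 1000000
```

The producer-side max_message_bytes has to stay at or below the broker/topic limit; otherwise events that fit the producer check but not the broker are rejected with ErrMessageSizeTooLarge.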