Performance hit when multiple Filebeats are sending to the same ES

I don't have replicas for my filebeat indices. Only the automatically created indices, e.g. .ds-.monitoring-es-*, have replicas.

I currently have the following settings.

queue.mem.events: 64000
queue.mem.flush.min_events: 4000

output.elasticsearch.bulk_max_size: 4000
output.elasticsearch.worker: 8
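
For a rough sense of how these numbers relate, here is a back-of-the-envelope sketch in Python. It only does arithmetic on the values above and assumes each worker has at most one bulk request outstanding at a time; that is a simplification on my part, not something taken from the Filebeat docs.

```python
# Values copied from the settings above.
queue_events = 64000   # queue.mem.events
bulk_max_size = 4000   # output.elasticsearch.bulk_max_size
workers = 8            # output.elasticsearch.worker

# Number of full-size bulk batches the memory queue can hold.
batches_in_queue = queue_events // bulk_max_size

# Assumption: one outstanding bulk request per worker.
max_in_flight = workers * bulk_max_size

# Share of the queue that can be in flight to ES at once.
in_flight_share = max_in_flight / queue_events

print(batches_in_queue, max_in_flight, in_flight_share)
```

Under that assumption, 8 workers can already have half of the queue in flight, so adding more workers may not help if Elasticsearch itself is the slow side.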

I've tried 12 and 16 workers, but didn't see any difference. I'll play around with the settings again.

From the Filebeat logs, it looks like the queue fills up after about 2 minutes and stays full for about 2-3 minutes, which is when the packet drop happens. It then empties completely (GC, maybe?) before filling up again.
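
The fill time gives a rough estimate of how far the output lags behind the input. The sketch below assumes the queue starts empty and fills at a steady rate, which the logs only approximately support.

```python
# Values from this post: queue size and observed time to fill.
queue_events = 64000        # queue.mem.events
fill_seconds = 2 * 60       # "about 2 minutes"

# Assumption: the queue starts empty and fills linearly.
# The result is the net surplus of intake over what ES acknowledges.
net_fill_rate = queue_events / fill_seconds

print(round(net_fill_rate))  # events per second, roughly 533
```

So the input side would be outpacing the output by roughly 500 events per second during the fill phase.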