Performance hit when multiple Filebeats send to the same ES cluster

I've been trying different combinations of queue.mem.events, queue.mem.flush.min_events and output.elasticsearch.bulk_max_size, but have not seen any significant improvement yet.
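For reference, those knobs all live in filebeat.yml. The snippet below is just one illustrative combination, not the exact numbers from my tests; the host is a placeholder. I've been keeping flush.min_events aligned with bulk_max_size so the queue hands the output batches that fill a whole bulk request.

# filebeat.yml -- example values only
queue.mem:
  events: 16384             # total events the in-memory queue can buffer
  flush.min_events: 2048    # forward a batch once this many events are queued...
  flush.timeout: 5s         # ...or after this long, whichever comes first

output.elasticsearch:
  hosts: ["https://es-host:9200"]   # placeholder host
  worker: 2                         # concurrent bulk requests per configured host
  bulk_max_size: 2048               # events per bulk request; kept equal to flush.min_events

Since several Filebeats are shipping to the same cluster, the total concurrent bulk load on ES is roughly (number of Filebeat instances) x worker x hosts, which is part of why I've been cautious about raising worker and bulk_max_size together.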

When going through the Elasticsearch logs, I did see a few JvmGcMonitorService warnings on one of the three nodes (the one most recently added to the cluster). The warning appeared only about 5 times over the last week, and only on that node. No searches were running against the cluster at the time, only indexing, so I'm not sure whether this is relevant.

[INFO ] [o.e.m.j.JvmGcMonitorService] [node-3] [gc] [young] [214426] [26447] duration [854ms], collections [1]/[1.5s], total [854ms]/[7.6m], memory [19.6gb]->[3.1gb]/[30gb], all_pools {[young] [16.5gb]->[16mb]/[0b]}{[old] [3gb]->[3gb]/[30gb]}{[survivor] [81mb]->[64.2mb]/[0b]}
[WARN ] [o.e.m.j.JvmGcMonitorService] [node-3] [gc] [young] [214426] overhead, spent [854ms] collecting in the last [1.5s]

Each node has a 30GB JVM heap. The cluster currently holds 2,750 shards in total, but I'm only indexing into 8 shards at a time (a single 40GB index, with the maximum primary shard size set to 5GB).
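For completeness, the 5GB cap is expressed as an ILM rollover condition; stripped down, it looks roughly like this (the policy name is a placeholder and this is a trimmed illustration, not my full policy):

# roll each index over once any primary shard reaches 5gb
PUT _ilm/policy/logs-rollover-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "5gb"
          }
        }
      }
    }
  }
}

With 8 primary shards per index, that works out to the roughly 40GB indices mentioned above.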