Performance hit when multiple filebeats are sending to same ES

These are the things I've changed, but there hasn't been any significant improvement. I'm now running a single filebeat instance.

  1. Changed JVM heap size to 30GB
  2. Added another node to the ES cluster (2 physical servers), making it a 2-node cluster (total JVM heap is now 60GB)
  3. 2 shards per index
  4. Turned off metricbeat
  5. Changed the index rollover size in ILM from 50GB to 40GB (see the sketch right after this list)
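
For reference, the rollover change in item 5 boils down to updating the rollover action in the hot phase of the ILM policy. A minimal sketch, assuming 7.x syntax and a hypothetical policy name of my-filebeat-policy:

PUT _ilm/policy/my-filebeat-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "40gb"
          }
        }
      }
    }
  }
}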

Changing output.elasticsearch.worker in filebeat.yml from 8 to 12 or 16 did not improve the performance, so I left it at 8.
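
For completeness, the settings in question sit under output.elasticsearch in filebeat.yml. A sketch of that block, where the hosts and queue values are placeholders rather than my actual config (only worker: 8 is what I'm currently running):

output.elasticsearch:
  hosts: ["es01:9200", "es02:9200"]   # placeholder host names
  worker: 8              # tried 12 and 16, no improvement, back to 8
  bulk_max_size: 1600    # example value; events per bulk request

queue.mem:
  events: 12800          # example: roughly worker * bulk_max_size
  flush.min_events: 1600
  flush.timeout: 5s

(The queue.mem values follow the common suggestion of sizing the queue to about worker x bulk_max_size so every worker can fill a full batch; I haven't verified whether that helps here.)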

Some small improvements I've observed since the changes:

  1. Previously, the number of packets received in 30 seconds was usually about 8K (with about 22K dropped). Now it's usually about 10K received and 20K dropped.
  2. When the index rollover size was 50GB, the Indexing Rate chart in Kibana for the current ES index fluctuated between 20K and 30K/s. After changing the size to 40GB, it fluctuates less, staying between 26K and 29K/s.

Not sure if this matters, but I also noticed that when the index rollover size was 50GB, the index's merges.total_throttled_time_in_millis could be as high as 50% of merges.total_time_in_millis. I got this by looking at the Stats of the index under Index Management. For example, for past indices I would see:

"merges": {
    "current": 0,
    "current_docs": 0,
    "current_size_in_bytes": 0,
    "total": 178,
    "total_time_in_millis": 5675026,
    "total_docs": 62632665,
    "total_size_in_bytes": 65673150397,
    "total_stopped_time_in_millis": 13460,
    "total_throttled_time_in_millis": 2712266,
    "total_auto_throttle_in_bytes": 67806790
}

After changing the rollover size to 40GB, this ratio has decreased to about 25-30%, e.g.:

"merges": {
    "current": 0,
    "current_docs": 0,
    "current_size_in_bytes": 0,
    "total": 1639,
    "total_time_in_millis": 10781764,
    "total_docs": 158044730,
    "total_size_in_bytes": 171873293917,
    "total_stopped_time_in_millis": 10079,
    "total_throttled_time_in_millis": 3052129,
    "total_auto_throttle_in_bytes": 132788866
}

The number of merges seems to have increased, though.
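
These numbers are from the index's Stats view in Index Management. If it's easier to compare, I believe the same merges section can be pulled directly with the index stats API, e.g. (the index name here is just a placeholder):

GET /filebeat-000012/_stats/merge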

Something else I observed:

As I mentioned previously, about every 3 minutes I get a window of roughly 2 minutes where packet drop is zero and all packets are received, and then packets start dropping again. This usually occurs when memstats.gc_next and memstats.memory_alloc are at their lowest compared to a few minutes before and after.

The indexing rate drops to 0 at around the same time the packet drop falls to 0 (about 30K packets received at that point). The indexing rate stays at 0 for about 20-40 seconds before picking up again, while the packet drop remains at 0 for about 2 more minutes.
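
In case it helps to correlate this more precisely: my understanding is that the same memstats values can be watched live by enabling the Beat's local HTTP monitoring endpoint in filebeat.yml and polling http://localhost:5066/stats, instead of waiting for the periodic metrics log lines. A minimal sketch (5066 is the default port):

# filebeat.yml - expose internal metrics (including beat.memstats) over HTTP
http.enabled: true
http.host: localhost
http.port: 5066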

What else can I try?