Performance hit when multiple filebeats are sending to same ES

These are the things I've changed, but there hasn't been any significant improvement. I'm now running a single filebeat instance.

  1. Changed JVM heap size to 30GB
  2. Added another node to the ES cluster (2 physical servers), making it a 2-node cluster (total JVM heap is now 60GB)
  3. 2 shards per index
  4. Turned off metricbeat
  5. Changed the index rollover size in ILM from 50GB to 40GB (see the sketch right after this list)
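
For reference, the rollover change in item 5 boils down to updating the rollover action in the hot phase of the ILM policy. A minimal sketch, assuming 7.x syntax and a hypothetical policy name of my-filebeat-policy:

PUT _ilm/policy/my-filebeat-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "40gb"
          }
        }
      }
    }
  }
}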

Changing output.elasticsearch.worker in filebeat.yml from 8 to 12 or 16 did not improve the performance, so I left it at 8.
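
For completeness, the settings in question sit under output.elasticsearch in filebeat.yml. A sketch of that block, where the hosts and queue values are placeholders rather than my actual config (only worker: 8 is what I'm currently running):

output.elasticsearch:
  hosts: ["es01:9200", "es02:9200"]   # placeholder host names
  worker: 8              # tried 12 and 16, no improvement, back to 8
  bulk_max_size: 1600    # example value; events per bulk request

queue.mem:
  events: 12800          # example: roughly worker * bulk_max_size
  flush.min_events: 1600
  flush.timeout: 5s

(The queue.mem values follow the common suggestion of sizing the queue to about worker x bulk_max_size so every worker can fill a full batch; I haven't verified whether that helps here.)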

Some small improvements I've observed since the changes:

  1. Previously, the number of packets received in 30 seconds was usually about 8K (with about 22K dropped). Now it's usually about 10K received and 20K dropped.
  2. When the index rollover size was 50GB, the Indexing Rate chart in Kibana for the current ES index fluctuated between 20K and 30K/s. After changing the size to 40GB, it fluctuates less, staying between 26K and 29K/s.

Not sure if this matters, but I also noticed that when the index rollover size was 50GB, the index's merges.total_throttled_time_in_millis could be as high as 50% of merges.total_time_in_millis. I got this by looking at the Stats of the index under Index Management. For example, for past indices I would see:

"merges": {
    "current": 0,
    "current_docs": 0,
    "current_size_in_bytes": 0,
    "total": 178,
    "total_time_in_millis": 5675026,
    "total_docs": 62632665,
    "total_size_in_bytes": 65673150397,
    "total_stopped_time_in_millis": 13460,
    "total_throttled_time_in_millis": 2712266,
    "total_auto_throttle_in_bytes": 67806790
}

After changing the rollover size to 40GB, this ratio has decreased to about 25-30%, e.g.:

"merges": {
    "current": 0,
    "current_docs": 0,
    "current_size_in_bytes": 0,
    "total": 1639,
    "total_time_in_millis": 10781764,
    "total_docs": 158044730,
    "total_size_in_bytes": 171873293917,
    "total_stopped_time_in_millis": 10079,
    "total_throttled_time_in_millis": 3052129,
    "total_auto_throttle_in_bytes": 132788866
}

The number of merges seems to have increased, though.
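
These numbers are from the index's Stats view in Index Management. If it's easier to compare, I believe the same merges section can be pulled directly with the index stats API, e.g. (the index name here is just a placeholder):

GET /filebeat-000012/_stats/merge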

Something else I observed:

As I mentioned previously, about every 3 minutes I get a window of roughly 2 minutes where packet drop is zero and all packets are received, and then packets start dropping again. This usually occurs when memstats.gc_next and memstats.memory_alloc are at their lowest compared to a few minutes before and after.

The indexing rate drops to 0 at around the same time the packet drop falls to 0 (about 30K packets received at that point). The indexing rate stays at 0 for about 20-40 seconds before picking up again, while the packet drop remains at 0 for about 2 more minutes.
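
In case it helps to correlate this more precisely: my understanding is that the same memstats values can be watched live by enabling the Beat's local HTTP monitoring endpoint in filebeat.yml and polling http://localhost:5066/stats, instead of waiting for the periodic metrics log lines. A minimal sketch (5066 is the default port):

# filebeat.yml - expose internal metrics (including beat.memstats) over HTTP
http.enabled: true
http.host: localhost
http.port: 5066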

What else can I try?