Note: You might want to experiment with the output.elasticsearch.bulk_max_size and output.elasticsearch.worker settings. The ideal values for bulk_max_size and worker differ from cluster to cluster.
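For reference, both settings live under the Elasticsearch output in filebeat.yml. The values below are only illustrative starting points (the hostnames are hypothetical), not recommendations:

```yaml
# filebeat.yml -- illustrative values only; tune per cluster
output.elasticsearch:
  hosts: ["es-node-1:9200", "es-node-2:9200", "es-node-3:9200"]  # hypothetical hosts
  worker: 4              # concurrent bulk requests per configured host
  bulk_max_size: 1600    # maximum number of events per bulk request
```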
What happens now if you have 2 or 3 Filebeats pointed at your 3 node cluster?
I've tried 12 and 16 workers, but didn't see any difference. I'll play around with the settings again.
From the filebeat logs, it looks like the queue fills up after about 2 minutes, stays full for about 2-3 minutes (which is when the packet drop happens), then empties completely (GC, maybe?) before filling up again.
I've been trying different combinations of queue.mem.events, queue.mem.flush.min_events and output.elasticsearch.bulk_max_size, but have not seen any significant improvement yet.
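One thing worth checking when trying those combinations is that the settings are consistent with each other. As a rough rule of thumb (hedged, not an official formula), the memory queue should hold at least worker × bulk_max_size events so every worker can be fed a full batch, and flush.min_events should match bulk_max_size so one flush fills one bulk request. A sketch of one internally consistent combination, with example values only:

```yaml
# filebeat.yml -- sketch of one consistent combination; values are examples
queue.mem:
  events: 12800            # >= worker * bulk_max_size (8 * 1600) so all workers get full batches
  flush.min_events: 1600   # match bulk_max_size so a flush fills exactly one bulk request
  flush.timeout: 1s        # flush partial batches after 1s under light load
output.elasticsearch:
  worker: 8
  bulk_max_size: 1600
```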
When going through the Elasticsearch logs, I did see a few JvmGcMonitorService warnings on one of the three nodes (the one most recently added to the cluster). It only appeared about 5 times in the last week, and only on this one node. No searching was happening on the cluster at the time, only indexing. Not sure if this is relevant.
[INFO ] [o.e.m.j.JvmGcMonitorService] [node-3] [gc] [young] [214426] [26447] duration [854ms], collections [1]/[1.5s], total [854ms]/[7.6m], memory [19.6gb]->[3.1gb]/[30gb], all_pools {[young] [16.5gb]->[16mb]/[0b]}{[old] [3gb]->[3gb]/[30gb]}{[survivor] [81mb]->[64.2mb]/[0b]}
[WARN ] [o.e.m.j.JvmGcMonitorService] [node-3] [gc] [young] [214426] overhead, spent [854ms] collecting in the last [1.5s]
Each node has a 30GB JVM heap. I currently have a total of 2750 shards, but I'm indexing into 8 shards at a time (a 40GB index with the max primary shard size set to 5GB).
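That shard count follows from the sizing: 40GB of index data divided by a 5GB primary shard cap gives 8 primary shards. If the 5GB cap is enforced through ILM rollover (available since Elasticsearch 7.13), the relevant setting would look roughly like this; the policy name is hypothetical and this is only a sketch of that one action:

```
PUT _ilm/policy/filebeat-rollover
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "5gb"
          }
        }
      }
    }
  }
}
```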