Metricbeat and Elasticsearch Breaker Limit

Hi All.

We have a Kubernetes cluster running Elasticsearch and Metricbeat.

We collect quite a lot of data with Metricbeat, and sometimes our Elasticsearch cluster goes down for maintenance. As I understand it, Metricbeat keeps queueing data in memory until it can reach Elasticsearch again, and sometimes it spools up to ~1.4 GB of data.
When Elasticsearch comes back up, the buffered data is sent in bulk and we hit the breaker limit.

We don't want to increase the breaker limit; we run Elasticsearch with tight resource settings to keep overheads low, and the heap size is set to 1.5 GB (I know it's very little, we have our reasons :stuck_out_tongue_winking_eye:).

So my questions are focused on the Metricbeat settings. I understand that I can use bulk_max_size, but I don't really understand what it means when the docs say:

The maximum number of events to bulk in a single Elasticsearch bulk API index request. The default is 50.

(ref)

How do 50 events accumulate to 1.4 GB? Is it multiple batches of 50 events combined into 1.4 GB and sent all at once?
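For context, this is the setting I'm talking about in metricbeat.yml (the host is just a placeholder, the value shown is the documented default):

```yaml
output.elasticsearch:
  hosts: ["elasticsearch:9200"]   # placeholder
  # Max number of events packed into a single _bulk request.
  # As far as I can tell, this does not limit how many events the internal queue can hold.
  bulk_max_size: 50
```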

Does setting flush.min_events and flush.timeout help with this? (ref)
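From what I can tell, those settings live under the memory queue, so I'd be tuning something like this (numbers are made up, not a recommendation):

```yaml
queue.mem:
  events: 4096             # total events the in-memory queue will hold
  flush.min_events: 512    # don't hand a batch to the output until at least this many events are buffered
  flush.timeout: 5s        # ...or until this much time has passed
```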

Another question: if the default pod resource memory limit for Metricbeat is set to 200Mi, how is the Metricbeat pod storing event data in memory way past that limit?
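For reference, the resources block I mean in the Metricbeat DaemonSet looks roughly like this:

```yaml
resources:
  limits:
    memory: 200Mi
  requests:
    cpu: 100m
    memory: 100Mi
```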

Your guidance in understanding how this works would be appreciated.

Regards

Michael

Ok, I found the answer to the first question:

Metricbeat uses an internal queue to store events before publishing them. The queue is responsible for buffering and combining events into batches that can be consumed by the outputs. The outputs will use bulk operations to send a batch of events in one transaction.
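So, if I'm reading that right, capping the queue should bound how much gets replayed when Elasticsearch comes back, while bulk_max_size only controls how big each individual request is. Something like this (illustrative values, not tested):

```yaml
queue.mem:
  events: 2048        # cap on how many events can pile up while Elasticsearch is unreachable
output.elasticsearch:
  bulk_max_size: 50   # each _bulk request drains at most 50 events from that queue
```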
