Filebeat stops sending logs to Elasticsearch after a cluster_block_exception occurs

I'm containerizing a monolithic application. In the current setup, each VM runs a Logstash process that harvests the logs, which are rotated by the application itself through Python's RotatingFileHandler.
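For context, the application's rotation is set up roughly like this (the file name, size limit, and backup count below are illustrative placeholders, not our actual values):

```python
import logging
from logging.handlers import RotatingFileHandler

# Hypothetical setup: file name, maxBytes, and backupCount are placeholders.
handler = RotatingFileHandler(
    "202011_uuid.log",          # rotated copies become 202011_uuid.log.1, .2, ...
    maxBytes=10 * 1024 * 1024,  # rotate once the file reaches ~10 MB
    backupCount=5,              # keep at most 5 rotated files
)
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("application log line")
```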

In the current stack, Logstash sends the logs to an Elasticsearch index. Occasionally the cluster has run out of disk space, but since Logstash retries indefinitely, no logs were lost once we freed up room for new data.

Now we would like to try Filebeat instead of Logstash, so we came up with this approach:

  • a persistent external filesystem
  • a container instance for our application, which mounts the filesystem and writes its rotated logs there, using a UUID in the file name (something like 202011_uuid.log.*) so that scaling causes no conflicts
  • a single Filebeat container for the whole cluster, which mounts the same filesystem, harvests the logs, and sends them directly to Elasticsearch.
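The Filebeat side of this setup looks roughly like the sketch below (the mount path and output host are placeholders, not our real values):

```yaml
filebeat.inputs:
  - type: log
    # the shared filesystem mounted into the Filebeat container (placeholder path)
    paths:
      - /mnt/shared-logs/*.log*

output.elasticsearch:
  hosts: ["elasticsearch:9200"]  # placeholder host
```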

This approach works well until Elasticsearch runs out of space and raises a cluster_block_exception. In this scenario, with the default config, Filebeat should retry indefinitely, but it actually makes only 3 attempts over roughly 2 minutes and then drops the event completely.

Doing some research, I stumbled upon some users who set max_retries to -1, but since I can't be sure I'll resolve a disk-space issue quickly in production, and I don't want Filebeat to flood my index with retries, I ended up configuring Filebeat this way:

  backoff.init: 20s
  backoff.max: 72h
  max_retries: -1
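To be explicit about where these options live, here is how I understand they nest in the full filebeat.yml, assuming the Elasticsearch output (host is a placeholder):

```yaml
output.elasticsearch:
  hosts: ["elasticsearch:9200"]  # placeholder host
  backoff.init: 20s
  backoff.max: 72h
  max_retries: -1
```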

Now, with the configuration above, it seems like Filebeat ignores the retries and makes only a single attempt.

What am I doing wrong? Is it a flaw in my stack or a configuration issue?
