Beats behavior when output host is unreachable

Hi,
I wanted to know how Filebeat and Metricbeat behave when all hosts specified under output.logstash / output.elasticsearch are unreachable.
Does the Beat save the data it collected while the hosts were unreachable and send it once they are reachable again? Do Filebeat harvesters save only the last read cursor and stop reading from the file?

Filebeat stops collecting data from inputs when it cannot forward the events to the output. This usually leads to issues on the host, e.g. Filebeat not closing removed files because it has not finished reading them. Filebeat guarantees at-least-once delivery, so your data won't get lost.

@shaunak Could you please give more info about Metricbeat?

All Beats buffer events in an internal queue before sending them to the output. If the output goes away, the events will keep buffering in the internal queue up to a limit. The limit depends on the type of queue.

There are two types of queues in Beats: in-memory and on-disk. For in-memory queues the limit is governed by the queue.mem.events setting. This setting defines the capacity of the queue in terms of the number of events it can hold. For on-disk queues the limit is governed by the queue.spool.file.size setting. This setting defines the capacity of the queue in terms of the total size (in bytes) of the events it can hold.
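For reference, here is a rough sketch of how those two settings might look in filebeat.yml or metricbeat.yml. The numbers are purely illustrative, not recommendations, and the exact options available depend on your Beats version, so please check the queue documentation for the release you run. As far as I know only one queue type can be configured at a time, which is why the second option is commented out.

```yaml
# Option 1: in-memory queue, capacity counted in number of events.
queue.mem:
  events: 65536        # illustrative value, not a recommendation

# Option 2: on-disk (spool) queue, capacity counted in bytes.
# Commented out because only one queue type can be configured at a time.
#queue.spool:
#  file:
#    size: 512MiB      # illustrative value, not a recommendation
```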

If the output goes away, the queue will get filled to its capacity and then stop accepting new events from queue producers.

In the case of Filebeat the queue producers are the inputs. As @kvch explained, the log input will keep track of how far it has read from the source file and resume processing from there once the queue has capacity available. So no data will be lost.

In the case of Metricbeat the queue producers are metricsets. They will keep enqueuing events onto the queue until it has reached capacity. After that point, though, they will drop any additional events on the floor. This is because, unlike log files, there's no good way to keep track of a "read location" for metrics. The next best approximation of this concept is a queue, which we already have. So if you want to avoid dropping metrics while an output is away for a long period of time, make sure to set your queue capacity (via either of the settings mentioned above) to something sufficiently large.
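To make the difference between the two producers concrete, here is a small toy model in Python. It is only a sketch of the behavior described above, not how Beats is actually implemented: the names BoundedQueue, run_log_input and run_metricset are made up for illustration, and the real Filebeat blocks on a full queue and tracks offsets in its registry rather than in a local variable.

```python
from collections import deque


class BoundedQueue:
    """Fixed-capacity queue standing in for the Beats internal queue (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def try_put(self, event):
        """Enqueue and return True, or return False when the queue is full."""
        if len(self.items) >= self.capacity:
            return False
        self.items.append(event)
        return True

    def drain(self):
        """Hand every queued event to the output and empty the queue."""
        sent = list(self.items)
        self.items.clear()
        return sent


def run_log_input(lines, queue, offset):
    """Filebeat-like producer: the read offset advances only when a line is accepted."""
    while offset < len(lines) and queue.try_put(lines[offset]):
        offset += 1
    return offset


def run_metricset(samples, queue):
    """Metricbeat-like producer: a sample that does not fit is dropped for good."""
    dropped = 0
    for sample in samples:
        if not queue.try_put(sample):
            dropped += 1
    return dropped


# Output unreachable: nothing drains the queues while the producers run.
log_lines = [f"line-{i}" for i in range(10)]
log_queue = BoundedQueue(capacity=4)
offset = run_log_input(log_lines, log_queue, offset=0)

metric_queue = BoundedQueue(capacity=4)
dropped = run_metricset([f"sample-{i}" for i in range(10)], metric_queue)

# Output reachable again: drain, then let the log input resume from its offset.
delivered_logs = log_queue.drain()
while offset < len(log_lines):
    offset = run_log_input(log_lines, log_queue, offset)
    delivered_logs += log_queue.drain()

delivered_metrics = metric_queue.drain()
```

In this model the log input eventually delivers all ten lines because it can re-read from its saved offset, while the metricset only delivers the samples that fit in the queue during the outage. A larger capacity shrinks that gap, which is the reason for the sizing advice above.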

Hope that helps,

Shaunak
