Filebeat high availability, avoiding duplicated results

yago82 · February 3, 2023, 8:32am

Dear Elastic Community,

I am looking for a solution to retrieve logs from multiple servers and I have considered using Filebeat for this purpose.

My main concern is to ensure high availability, avoiding duplicated results. Is it possible to have multiple Filebeats to cover for each other in case one of them fails, ensuring that no log is missed or duplicated? If so, how can the secondary instance know where the primary instance left off?

Thank you in advance

Ayush_Mathur · February 3, 2023, 11:19am

Hello @yago82, I believe this is not possible with Filebeat, since it relies on a registry file, stored in a persistent location, to keep the last offset read from each log file. You can run two Filebeat instances on the same host reading the same log files, but you cannot configure them to use the same registry: a lock exception is thrown right at startup. And if the registries are different, each instance tracks its own offsets and ships every line, so the whole purpose here is defeated.
That's why Filebeat and other log shippers are deployed as a DaemonSet on Kubernetes, to ensure at least one pod is running on every node.
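
To illustrate the lock behaviour described above, here is a minimal sketch of an exclusive lock on a data directory. It is not Filebeat's actual implementation; the file name `shipper.lock`, the `DataPathLocked` exception, and both function names are made up for the example.

```python
import os


class DataPathLocked(Exception):
    """Raised when another process already holds the lock for a data path."""


def acquire_lock(data_path: str) -> int:
    """Create an exclusive lock file in data_path; fail if it already exists."""
    lock_file = os.path.join(data_path, "shipper.lock")  # hypothetical file name
    try:
        # O_CREAT | O_EXCL makes creation atomic: only one caller can succeed.
        return os.open(lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise DataPathLocked(f"{data_path} is already in use by another instance")


def release_lock(data_path: str, fd: int) -> None:
    """Close and delete the lock file so that another instance may start."""
    os.close(fd)
    os.remove(os.path.join(data_path, "shipper.lock"))
```

A second `acquire_lock` call on the same directory raises `DataPathLocked` until the first holder calls `release_lock`, which is roughly why two instances pointed at the same registry cannot both start.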

yago82 · February 3, 2023, 1:58pm

Hi @Ayush_Mathur ,

Is it possible to place the registry file on a shared network location such that if the first server goes down, the Filebeat installed on a second server can read the registry file from the shared location and continue collecting logs from where the Filebeat on the first server left off?

Thank you

Ayush_Mathur · February 3, 2023, 2:03pm

It is indeed possible, but your initial question was about two Filebeat instances already running at the same time. In the setup you describe now, you will have to closely monitor the Filebeat processes so that the second one starts as soon as the first stops.
May I know the basic reason behind your idea of having Filebeat HA for each server? It would be hard to manage your Beat instances if you have multiple nodes/servers in your environment, as you'll be maintaining two Filebeat instances per server (one for local log files and one for remote log files for HA).
If your main reason is resources, maybe try running two Filebeat instances on the same server (of course with separate registry locations) but reading different log files.
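
The "start the second as soon as the first stops" monitoring could be sketched roughly as below. This is only an illustration under assumptions: the `StandbyMonitor` class, the heartbeat idea, and the action names are hypothetical, and how a heartbeat is produced or how Filebeat is actually started and stopped is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandbyMonitor:
    """Decides when a standby shipper may take over from the primary."""

    timeout_seconds: float                  # silence longer than this counts as "down"
    last_heartbeat: Optional[float] = None  # time of the last sign of life from the primary
    standby_active: bool = False

    def record_heartbeat(self, now: float) -> None:
        """Note that the primary was seen alive at time `now`."""
        self.last_heartbeat = now

    def primary_is_down(self, now: float) -> bool:
        # A primary that was never seen is treated as "unknown", not "down",
        # so the standby does not start by accident.
        if self.last_heartbeat is None:
            return False
        return (now - self.last_heartbeat) > self.timeout_seconds

    def tick(self, now: float) -> str:
        """Return the action to take: 'start_standby', 'stop_standby' or 'none'."""
        down = self.primary_is_down(now)
        if down and not self.standby_active:
            self.standby_active = True
            return "start_standby"
        if not down and self.standby_active:
            self.standby_active = False
            return "stop_standby"
        return "none"
```

Note that a missed heartbeat does not prove the primary has really stopped (the network between the two servers may simply be down), so a check like this alone does not guarantee that only one instance is running.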

yago82 · February 13, 2023, 12:17pm

Hi @Ayush_Mathur,

I apologize, Ayush. First of all, thank you for the time you're dedicating to this. I'll try to explain myself better and be as clear as possible. Everything is currently in the theoretical phase, so there is a bit of confusion.

We aim to establish a highly available logging solution by setting up two separate servers, each equipped with its own Filebeat instance. The second server will serve as a backup and will only be activated if the first server experiences downtime. Both Filebeat instances will obtain their registry information from a shared directory. The objective of this setup is to ensure that logs are immediately ingested into Elastic without any disruptions or delays caused by application failures.

Thank you
Regards

leandrojmp · February 13, 2023, 1:53pm

You can't share the same registry file between two running instances. If you want to share the registry file, one of the Filebeat instances must not be running, and it can only start once you can guarantee that the other instance is not running.

Also, what is your source of data? If you are reading from a file, is this file local or on a shared network?

The way you implement HA depends on how you are collecting your data. In my experience, it makes no sense to think in terms of HA for Filebeat if you are using it to read logs.
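
On the duplicates side of the question: since a handoff between two instances may re-send some lines, one common approach is to give every event a deterministic ID so that a re-sent line overwrites the earlier copy instead of creating a second document. The sketch below only illustrates the idea with a plain dictionary standing in for the index; `event_id`, `index_events`, and the example path are hypothetical, and it assumes both servers see the file under the same path.

```python
import hashlib


def event_id(source_path: str, offset: int, line: str) -> str:
    """Derive a stable ID from where the line came from and what it says."""
    key = f"{source_path}\x00{offset}\x00{line}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()


def index_events(store: dict, events: list) -> int:
    """Write events into `store` keyed by ID; return how many were new."""
    new = 0
    for source_path, offset, line in events:
        doc_id = event_id(source_path, offset, line)
        if doc_id not in store:
            new += 1
        # Writing under the same ID replaces the old copy, so replays are harmless.
        store[doc_id] = {"path": source_path, "offset": offset, "message": line}
    return new
```

For example, indexing the same `("/mnt/share/app.log", 6, "stop")` tuple twice leaves a single entry in `store`.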

yago82 · February 13, 2023, 2:00pm

Hi @leandrojmp,

The second Filebeat would only be activated if the first server goes down, so it respects the condition that the two Filebeat instances are never active simultaneously.
The source of the data is a CIFS share.
Thank you for the advice.

Thank you

system · March 13, 2023, 4:00pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
