Persist Filebeat registry on Kubernetes host

Hello everyone!

We have Filebeat running as a DaemonSet in our Kubernetes cluster, using, for the most part, the YAML provided in the Filebeat documentation.
I am trying to decide whether we should set tail_files to true or false. Our Filebeat input is log files that roll every 24 hours, and according to the documentation it looks like we should use tail_files: false if we don't want to risk losing the first few lines of a new log file when rolling happens (we have scan_frequency: 1s, so I don't know whether that is even a possibility).
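
For reference, a minimal sketch of the input settings being discussed. The path is hypothetical and only there for illustration; tail_files and scan_frequency are the options mentioned above.

```yaml
filebeat.inputs:
- type: log
  paths:
    # Hypothetical location of the daily-rolled log files.
    - /var/log/app/*.log
  # false: read new files from the beginning instead of from the end.
  tail_files: false
  # How often Filebeat checks the paths for new files.
  scan_frequency: 1s
```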

However, with that setting, any Filebeat pod restart or upgrade destroys the registry file, since it resides in the pod itself. Filebeat then re-sends all the events from every file matching the configured pattern, producing duplicates in Elasticsearch. I believe this scenario is specific to Kubernetes, as there is no true "restart" of a pod.

As tail_files: true is not an option either, I was thinking about persisting the Filebeat registry file on a volume on the host VM, so that after any pod restart, upgrade, or crash (whether triggered manually or by Kubernetes itself) Filebeat picks up from the correct offset. Kind of like a hybrid between the two tail_files settings.

Would this be a good approach to our problem?
In theory, with a DaemonSet that does not use a rollingUpdate strategy, there should never be more than one Filebeat using the registry at a time.
Still, is there a scenario in which the registry could get corrupted, or in which more than one entity could be updating it?
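
To make "without a rollingUpdate strategy" concrete, this is roughly the setup I have in mind. It is only a sketch: the name is a placeholder, and I am assuming OnDelete is the alternative strategy we would use.

```yaml
# Fragment of the DaemonSet, not a complete manifest.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat   # placeholder name
spec:
  updateStrategy:
    # With OnDelete, a pod is only replaced after the old one is deleted manually.
    type: OnDelete
```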

Thanks!

Hi @mariomechoulam,

Yes, as you mention, it is recommended to persist the registry on the host; we placed a comment about that in the YAML files provided in the repository. With this in place, you wouldn't need tail_files for this case.
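
As a rough sketch of what that looks like in the DaemonSet, assuming the default data directory of the Filebeat container image (/usr/share/filebeat/data) and a host directory of /var/lib/filebeat-data; check both paths against the manifest for the version you deploy:

```yaml
# Fragment of the Filebeat DaemonSet pod spec, not a complete manifest.
spec:
  template:
    spec:
      containers:
      - name: filebeat
        volumeMounts:
        # Filebeat keeps its registry under its data directory.
        - name: data
          mountPath: /usr/share/filebeat/data
      volumes:
      # Back the data directory with a directory on the node so it outlives the pod.
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
```

Because the directory lives on the node, a replacement pod scheduled there finds the existing registry and continues from the stored offsets.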

Awesome, that makes it clear.
Thanks Jaime!
