Filebeat: Persist processed logs if docker container crashes

I'm running Filebeat in several Docker containers built from this Dockerfile: https://github.com/primait/docker-filebeat/blob/master/1.2/Dockerfile

When a container crashes or goes down for some reason, I need to make sure that the logs processed up to that point are not reprocessed again, which would lead to a big chunk of duplicated entries in the ELK stack.

From the directory-layout page I understand that the data folder (path.data) is where Filebeat keeps its registry of processed logs. Am I right?

My filebeat.yml has this entry:
file:
  path: "/etc/filebeat"
path.data: "/etc/filebeat/data"

But when I enter the Docker container and navigate to /etc/filebeat/data I see an empty folder, and I'm 100% sure Filebeat is processing logs because I can see them live in Kibana.

Can someone help me out here?
Basically my intention is to avoid processing the same log entries multiple times in case the Docker container crashes.

I shut down the Docker container running Filebeat. In Kibana I selected a time window that contains logs, then restarted the Filebeat container several times, and for that exact time window the Kibana dashboard shows neither duplicated entries nor an increase in the number of logs.

This leaves me wondering why there are no duplicated log entries for that time window, as I expected Filebeat to reprocess the logs every time it starts (the container is recreated).

Am I missing something here?

My coworker suspects that Filebeat reprocesses the logs every time it starts (since the container is recreated) and asked me to investigate a way to fix this (he's currently out of office).

From the directory-layout page I understand that the data folder (path.data) is where Filebeat keeps its registry of processed logs. Am I right?

Yes: Configure general settings | Filebeat Reference [master] | Elastic
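
Note that the path.data setting described on that page applies to the newer releases; on Filebeat 1.2 (which the Dockerfile you linked uses) the registry location is controlled by the registry_file option instead. A minimal sketch of what that could look like in filebeat.yml (the paths here are only placeholders, adjust them to your setup):

filebeat:
  prospectors:
    -
      paths:
        - /var/log/myapp/*.log      # placeholder, point this at your real log files
      input_type: log
  # Where Filebeat remembers how far it has read each file; I believe the
  # default is a .filebeat file in the current working directory.
  registry_file: /etc/filebeat/data/registry

Whatever path you pick, it has to survive the container being recreated (more on that below).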

But when I enter the Docker container and navigate to /etc/filebeat/data I see an empty folder, and I'm 100% sure Filebeat is processing logs because I can see them live in Kibana.

That's a bit surprising. Filebeat does log the location of the registry file, so I'd check its log. You might have to bump the log level.
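
For example, a sketch of how you could turn up logging in filebeat.yml to see where the registry ends up (the file path below is just an example):

logging:
  level: debug                # bump from the default so the registry path shows up
  to_files: true
  files:
    path: /var/log/filebeat   # example location inside the container
    name: filebeat.log

Alternatively, start the binary with -e to log to stderr, which is often easier to read via docker logs.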

This leaves me wondering why there are no duplicated log entries for that time window, as I expected Filebeat to reprocess the logs every time it starts (the container is recreated).

Is the registry file stored in a persistent volume? It should be; if it's not, it is lost when the container is recreated, and in that case I'd also expect any remaining logs to be shipped again.
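
As a rough sketch, with docker-compose you could keep the registry on a named volume so it survives the container being recreated (the volume name and mount path are assumptions; match them to your registry_file / path.data setting):

version: "2"
services:
  filebeat:
    build: .                                   # the image built from your Dockerfile
    volumes:
      - filebeat-registry:/etc/filebeat/data   # registry survives container recreation
volumes:
  filebeat-registry:

With plain docker run, the equivalent would be a -v filebeat-registry:/etc/filebeat/data mount.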

Alternatively, instead of running Filebeat in each container, you could run a single Filebeat container and mount the Docker log volumes into it for processing. Like this, Filebeat keeps running even if one of your application containers stops.
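
A rough compose sketch of that layout (service names, images and paths are made up for illustration):

version: "2"
services:
  myapp:
    image: myorg/myapp                  # placeholder application image
    volumes:
      - app-logs:/var/log/myapp         # the app writes its logs here
  filebeat:
    build: .                            # the Filebeat image from your Dockerfile
    volumes:
      - app-logs:/var/log/myapp:ro      # same logs, mounted read-only
      - filebeat-registry:/etc/filebeat/data
volumes:
  app-logs:
  filebeat-registry:

Point the prospector paths at /var/log/myapp/*.log, and only this one Filebeat container needs the registry volume.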

Thanks for your help guys.
This is exactly what I was looking for: https://www.elastic.co/guide/en/beats/filebeat/1.2/configuration-filebeat-options.html#_registry_file
