Filebeat: Persist processed logs if docker container crashes


(pedro) #1

I'm running Filebeat on several docker containers using this Dockerfile: https://github.com/primait/docker-filebeat/blob/master/1.2/Dockerfile

When a container crashes or goes down for some reason, I need to make sure the logs processed up to that point are not processed again, since that would put a big chunk of duplicated logs into ELK.

From this page directory-layout I get that the data folder (path.data) is where filebeat keeps the processed logs registered. Am I right?

My filebeat.yml has this entry:
file:
path: "/etc/filebeat"
path.data: "/etc/filebeat/data"

But when I enter the docker container and navigate to /etc/filebeat/data I see an empty folder and I'm 100% sure Filebeat is processing logs because I see them live on Kibana.

Can someone help me out here?
Basically, my intention is to avoid processing the same log entries multiple times if the docker container crashes.


(pedro) #2

I shut down the docker container running Filebeat. In Kibana I selected a time window containing logs, then restarted the Filebeat container several times. On the Kibana dashboard for that exact time window I see neither duplicated entries nor an increase in the number of logs.

This leaves me wondering why there are no duplicated log entries for that time window, as I expected Filebeat to reprocess the logs every time it starts (the container is recreated).

Am I missing something here?

My coworker said he suspects that Filebeat reprocesses the log every time it starts (the container is recreated) and asked me to investigate a way to fix this (he's currently out of office).


(Magnus Bäck) #3

From this page directory-layout I get that the data folder (path.data) is where filebeat keeps the processed logs registered. Am I right?

Yes: https://www.elastic.co/guide/en/beats/filebeat/master/configuration-global-options.html#_registry_file

But when I enter the docker container and navigate to /etc/filebeat/data I see an empty folder and I'm 100% sure Filebeat is processing logs because I see them live on Kibana.

That's a bit surprising. Filebeat does log the location of the registry file so I'd check in the log. You might have to bump the log level.

This leaves me wondering why there are no duplicated log entries for that time window, as I expected Filebeat to reprocess the logs every time it starts (the container is recreated).

Is the registry file stored in a persistent volume? It should be. If it isn't, it is lost whenever the container is recreated, and in that case I'd also expect any logs still present to be shipped again.
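
To illustrate why the registry has to survive a container restart, here is a minimal Python sketch of the general idea: a shipper remembers a byte offset per log file and resumes from it. This is not Filebeat's actual code or registry format. The function names (ship_new_lines, demo), the JSON layout, and the file names are all hypothetical and chosen only for this example.

```python
import json
import os
import tempfile


def ship_new_lines(log_path, registry_path):
    """Return the complete lines not yet shipped and persist the new offset.

    Hypothetical illustration only; not Filebeat's real registry format.
    """
    # Load the saved byte offsets; a missing registry means "start from 0".
    offsets = {}
    if os.path.exists(registry_path):
        with open(registry_path) as fh:
            offsets = json.load(fh)
    start = offsets.get(log_path, 0)

    with open(log_path, "rb") as fh:
        fh.seek(start)
        data = fh.read()

    # Ship only complete lines; a trailing partial line waits for the next run.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    offsets[log_path] = start + end

    # Write to a temp file and rename, so a crash cannot leave a half-written registry.
    tmp_path = registry_path + ".tmp"
    with open(tmp_path, "w") as fh:
        json.dump(offsets, fh)
    os.replace(tmp_path, registry_path)
    return lines


def demo():
    workdir = tempfile.mkdtemp()
    log_path = os.path.join(workdir, "app.log")
    registry_path = os.path.join(workdir, "registry.json")
    with open(log_path, "wb") as fh:
        fh.write(b"line 1\nline 2\n")

    first_run = ship_new_lines(log_path, registry_path)
    # Restart with the registry intact (persistent volume): nothing is re-sent.
    restart_kept = ship_new_lines(log_path, registry_path)
    # Restart after the registry was lost (container recreated without a volume).
    os.remove(registry_path)
    restart_lost = ship_new_lines(log_path, registry_path)
    return first_run, restart_kept, restart_lost


if __name__ == "__main__":
    print(demo())
```

In the sketch, the second call ships nothing because the offset was kept, while the call after the registry is deleted ships both lines again. That is the duplication I'd expect if the real registry file lives only inside the container's own filesystem.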


(ruflin) #4

Alternatively, instead of running Filebeat in each container, you could run one container with Filebeat and mount the Docker log volumes into that container for processing. That way Filebeat keeps running even if one of your containers stops.


(pedro) #5

Thanks for your help guys.
This is exactly what I was looking for: https://www.elastic.co/guide/en/beats/filebeat/1.2/configuration-filebeat-options.html#_registry_file

