Filebeat parsing Docker logs with max-size limit

Hi, I have a few services that create a lot of logs, and I want to start using ES + Filebeat to parse them. Once the logs are parsed and sent to ES I don't need them anymore. Also, since I'm forced to use an on-premise server with a limited hard disk, the server shuts down and the application goes offline when Docker logs fill the whole disk, which can't happen! So I limit the Docker logs in docker-compose to 50 MB (max-size: "50m"). My question is: will this disturb Filebeat's harvesting if the interval is small enough? In other words, will I miss any logs in ES because of limiting the Docker log size?
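
For reference, here's roughly what I mean in the compose file (the service name and image are just placeholders):

```yaml
services:
  my-service:               # placeholder name for one of the noisy services
    image: my-service:latest
    logging:
      driver: "json-file"   # default Docker logging driver
      options:
        max-size: "50m"     # cap the log file at 50 MB
        max-file: "1"       # keep only a single log file
```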

Thanks for any help and tips, great community!

When your container's log file crosses 50 MB in size, it will get truncated. As with any truncation scenario, log lines that Filebeat hasn't "seen" yet might be lost. Your best bet would be to set backoff to something very low so that Filebeat aggressively looks for new log lines, which increases its chances of staying caught up before truncation occurs.
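
For example, something along these lines in filebeat.yml (the input type and paths below are just an illustration of where the json-file driver usually writes container logs; adjust them to your setup):

```yaml
filebeat.inputs:
  - type: log                                  # log input tailing the container log files
    paths:
      - /var/lib/docker/containers/*/*.log     # typical location of json-file driver logs
    backoff: 1s        # re-check the file after only 1s once EOF is reached
    max_backoff: 2s    # never wait longer than 2s between checks
    backoff_factor: 1  # don't grow the wait time on consecutive idle checks
```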

Of course, once the file is truncated, Filebeat will detect this and start from the beginning of the file.

Thanks for the info! But let's say that with the 50 MB limit I will always have logs for at least the last 5 minutes (log volume is very predictable), and I set backoff to 1s... will I get duplicates in ES? Duplication isn't quite the right term; what I basically mean is the same log line being indexed multiple times because the log file gets truncated and Filebeat picks it up again. If that's true, I'm afraid I can't use it. Where can I read more about this (for a person not familiar with the Go language)?

You should not get duplicates due to truncation.

Filebeat internally maintains a "byte offset" of where to start reading from in a file. Initially that is set to 0. As Filebeat consumes log lines from the file, that byte offset gets incremented. When a file is truncated, Filebeat detects that scenario and resets the byte offset to 0.

With a sufficiently low (i.e. aggressive) backoff Filebeat will be reading log lines as quickly as it can, and incrementing this internal byte offset. Ideally, most of the time the byte offset will be at EOF, that is Filebeat is completely caught up. When the file gets truncated, the byte offset will be reset to 0 and Filebeat will start reading log lines from the top of the file again.
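
To make that concrete, here is a simplified sketch of the truncation check, not Filebeat's actual code, just the idea in a few lines of Go:

```go
// Simplified illustration (not Filebeat's real implementation): keep a byte
// offset per file and reset it to 0 whenever the file shrinks below it,
// which is how a truncation shows up.
package harvester

import "os"

type state struct {
	offset int64 // bytes of the file already read and shipped
}

func (s *state) checkTruncation(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	// File is now smaller than what we already consumed: it was truncated,
	// so the next read starts from the beginning again.
	if info.Size() < s.offset {
		s.offset = 0
	}
	return nil
}
```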

So I'm not sure how we would end up with duplicates in ES, but perhaps I'm missing something so please feel free to point it out!

Hope this helps,

Shaunak
