We run Filebeat (9.4.2) in our Kubernetes clusters, mostly using the autodiscover feature and annotations. We also have a lot of cronjobs and sidecar containers. From a logging perspective Filebeat is working great, but over time its memory usage only increases until it is constantly hitting OOM. The only way to clear the memory usage is to "reset" the data registry on each host and restart; a normal restart does not help. The downside is that all active log files get resent, creating a ton of duplicate logs in our logging system, so this is really a last resort.
From my understanding, Filebeat has built-in registry cleanup and a few other related settings. However, I have not seen any impact from explicitly setting these (for example clean_inactive, clean_removed).
The attached graph shows the point where I reset the registry manually: usage dropped from 640 MiB to 80 MiB. Current usage is about 120 MiB (after 4 days), so I expect this trend to continue.
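For reference, a minimal sketch of how these cleanup options could be set explicitly on the filestream input used by the hints default config. The durations are illustrative assumptions, not recommendations. As far as the filestream documentation goes, clean_inactive is expected to be greater than ignore_older plus prospector.scanner.check_interval, so ignore_older is set alongside it here.

```yaml
# Sketch only: durations are illustrative assumptions.
hints.default_config:
  enabled: false
  type: filestream
  id: container-${data.kubernetes.container.id}
  # Ignore files that have not been modified within this window.
  ignore_older: 48h
  # Remove registry state for files inactive longer than this.
  # Should be greater than ignore_older + prospector.scanner.check_interval.
  clean_inactive: 72h
  # Remove registry state for files that no longer exist on disk.
  clean_removed: true
  paths:
    - /var/log/containers/*${data.kubernetes.container.id}.log
```

Whether this reduces memory depends on how many stale entries the registry actually holds, so it is worth comparing the registry size before and after.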
/var/lib/filebeat-data/registry/filebeat# ls -lah log.json
-rw------- 1 root root 5.5M Jun 22 08:32 log.json
/var/lib/filebeat-data/registry/filebeat# wc log.json
26198 26198 5705884 log.json
Current filebeat config follows,
logging.level: error
logging.to_stderr: true
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      hints.default_config:
        enabled: false
        type: filestream
        id: container-${data.kubernetes.container.id}
        prospector.scanner.symlinks: true
        prospector.scanner.fingerprint:
          enabled: false
          offset: 0
          length: 256
        parsers:
          - container: ~
          - multiline:
              type: pattern
              pattern: '^[[:space:]]'
              negate: false
              match: after
        paths:
          - /var/log/containers/*${data.kubernetes.container.id}.log
Any thoughts or ideas on bringing the running memory down without resetting the data folders are welcome.
Hi @ryd-devops
There is one weird thing in your configuration: You disabled fingerprint generation, but did not change the file identity.
May I ask what is the goal/reason? We highly recommend using fingerprint as the file identity in K8s to avoid issues with inode reuse.
When you set
prospector.scanner.fingerprint:
  enabled: false
This only disables generating the fingerprint, but it does not change the file identity.
If you made those changes trying to reduce memory consumption, I recommend reverting them back to the defaults.
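For illustration, a fragment that keeps fingerprint generation enabled and selects fingerprint as the file identity explicitly might look like the sketch below. It assumes the same filestream input as in the config above; the offset and length values are assumed to match the documented defaults and should be checked against the docs for the version in use.

```yaml
# Sketch only: values are assumed defaults, verify for your version.
hints.default_config:
  enabled: false
  type: filestream
  id: container-${data.kubernetes.container.id}
  # Identify files by a hash of their first bytes rather than by inode.
  file_identity.fingerprint: ~
  prospector.scanner:
    symlinks: true
    fingerprint:
      enabled: true
      offset: 0
      length: 1024
  paths:
    - /var/log/containers/*${data.kubernetes.container.id}.log
```

Simply omitting both the file_identity and prospector.scanner.fingerprint settings should have the same effect if the defaults are as assumed.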
Do you have a rough number of active pods/number of files Filebeat is ingesting?
Do you have a chart showing this increase in memory? Is it a steady slow increase? Does it jump when a 'large' number of new pods are started?
I see you're using hints with default config, are you adding hints annotations to some containers or are all containers running with the default configuration?
Hi @TiagoQueiroz, thanks for the reply and the useful information provided.
Regarding the missing file identity, I have reverted as you suggested. The original intention was to forward logs from sidecar pods that are only very short; with the default fingerprint size they were not being picked up. Nonetheless, I will move the forwarding of these specific logs to a new section if the memory stabilises.
The one cluster has about 200 pods (with the co.elastic.logs/enabled: true annotation, which autodiscover uses to forward logs; we don't forward all pod logs, only annotated ones).
The memory growth history is shown in the attached graph; the drop is where I finally removed the data for the specific node. Funnily enough, it has been relatively stable since then. I also can't explain why one node was more affected than the others.
So in summary, I have reverted to a more default config (shown below) as suggested and will monitor.
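One possible shape for such a separate section is a dedicated filestream input with a smaller fingerprint length, so that short files are still picked up without disabling fingerprinting. This is a sketch under assumptions: the input id and the path glob are hypothetical placeholders, and 64 bytes is assumed to be the smallest fingerprint length Filebeat accepts.

```yaml
# Sketch only: id and path are hypothetical placeholders.
filebeat.inputs:
  - type: filestream
    id: short-sidecar-logs
    file_identity.fingerprint: ~
    prospector.scanner:
      symlinks: true
      fingerprint:
        enabled: true
        offset: 0
        # Files shorter than this are not ingested until they grow past it.
        length: 64
    parsers:
      - container: ~
    paths:
      - /var/log/containers/*sidecar*.log
```

A shorter fingerprint raises the chance that two files with identical opening bytes are treated as the same file, so this trade-off only makes sense for logs whose first lines differ.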
logging.level: error
logging.to_stderr: true
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      hints.default_config:
        enabled: false
        type: filestream
        id: container-${data.kubernetes.container.id}
        prospector.scanner:
          symlinks: true
        parsers:
          - container: ~
          - multiline:
              type: pattern
              pattern: '^[[:space:]]'
              negate: false
              match: after
        paths:
          - /var/log/containers/*${data.kubernetes.container.id}.log
Thanks again