Filebeat filestream input not releasing file handler with hard-linked file

Hello,

I have a use case where I ingest files created by the Wazuh agent. These files are created in paths with the following format:

/var/ossec/logs/archives/YYYY/MM/ossec-archive-dd.json

For example:

/var/ossec/logs/archives/2022/05/ossec-archive-05.json

Those files are hard-linked to the file /var/ossec/logs/archives/archives.json, so both names point to the same inode, and Filebeat is configured to read the /var/ossec/logs/archives/archives.json file.
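To illustrate the hard-link relationship described above, here is a minimal sketch that recreates the layout in a temporary directory and compares inode numbers. The temporary paths and variable names are placeholders for illustration only, not the real Wazuh paths.

```shell
# Recreate the layout in a throwaway directory (placeholder paths).
tmp=$(mktemp -d)
mkdir -p "$tmp/2022/05"
echo '{"demo":1}' > "$tmp/2022/05/ossec-archive-05.json"

# Create a hard link: a second name for the same inode.
ln "$tmp/2022/05/ossec-archive-05.json" "$tmp/archives.json"

# ls -i prints the inode number as the first field.
ino_dated=$(ls -i "$tmp/2022/05/ossec-archive-05.json" | awk '{print $1}')
ino_link=$(ls -i "$tmp/archives.json" | awk '{print $1}')
echo "dated file inode: $ino_dated, archives.json inode: $ino_link"

# Clean up the throwaway directory.
rm -rf "$tmp"
```

Both names should report the same inode number, so a process that opened archives.json also holds the dated file open.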

Sometimes the source file needs to be rotated because of its size, so I will have something like this:

/var/ossec/logs/archives/2022/05/ossec-archive-05.json
/var/ossec/logs/archives/2022/05/ossec-archive-05-001.json
/var/ossec/logs/archives/2022/05/ossec-archive-05-XXX.json

The newly rotated file is then hard-linked to /var/ossec/logs/archives/archives.json and Filebeat can read it without any problems.

The issue is:

After the original file is rotated and has not been updated for a while, Filebeat does not release its file handle. I have a background process that compresses the ossec-archive-XX.json files because of disk space issues. It uses lsof to check whether a file is still in use, and since Filebeat never releases the handle, the files don't get compressed.
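For context, a check of this kind can be sketched as below. This is only an illustration, not the actual script: the function names file_is_free and compress_if_free are hypothetical, and it assumes lsof and gzip are installed.

```shell
# Hypothetical helper: succeeds when lsof finds no process holding the file.
# lsof exits non-zero when nothing has the file open. It also exits non-zero
# on errors, so a real script should first verify that lsof is available.
file_is_free() {
    ! lsof "$1" >/dev/null 2>&1
}

# Hypothetical helper: compress the file only when nothing has it open.
compress_if_free() {
    if file_is_free "$1"; then
        gzip -- "$1"
    else
        echo "skipping $1: still open by another process" >&2
        return 1
    fi
}
```

With a guard like this, a rotated file that Filebeat still holds open is skipped on every run, which matches the behavior described above.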

This issue did not happen when using the log input; it started happening when we switched to the filestream input.

The documentation says that the default value for close.on_state_change.inactive is 5 minutes, so I would expect Filebeat to close the file handle for a file after 5 minutes without any update. This is not happening. I also tried setting it explicitly in filebeat.yml, but that didn't work either. The only way I can release the file handles is to restart Filebeat.

Is there any other setting I could tweak to make Filebeat release the file handle after some time?

This is my current filebeat.yml:

filebeat.config.inputs:
  enabled: true
  path: "/etc/filebeat/inputs/*.yml"

setup.ilm.enabled: false
ilm.enabled: false
setup.template.enabled: false

queue.mem:
  events: 8000
  flush.min_events: 1000
  flush.timeout: 1s

output.elasticsearch:
  hosts: '${ES_HOT_NODES}'
  loadbalance: true
  worker: 2
  bulk_max_size: 500
  compression_level: 5
  username: '${ES_USERNAME}'
  password: '${ES_PASSWORD}'
  ssl.certificate_authorities: ["/etc/filebeat/config/certs/ca.crt"]

http.enabled: true
http.port: 5067
monitoring.enabled: false
monitoring.cluster_uuid: '${ES_MONITORING_UUID}'

And this is the input file, wazuh.yml:

- type: filestream
  paths:
    - /var/ossec/logs/archives/archives.json
  fields:
    index_prefix: index-name
  pipeline: ingest-pipeline-name

I'm running 7.16.3; an update is planned but will not happen right now.

I've changed the Filebeat path to point to the original file instead of the hard link, but it is still not releasing the file handle.

[root@REDACTED May]# ls -larth
total 63G
drwxr-x---. 3 ossec ossec  17 May  2 14:59 ..
-rw-r-----. 2 ossec ossec   0 May 17 00:00 ossec-archive-17.log
-rw-r-----. 1 ossec ossec 41G May 17 13:38 ossec-archive-17-001.json
drwxr-x---. 2 ossec ossec 100 May 17 13:38 .
-rw-r-----. 2 ossec ossec 23G May 17 15:42 ossec-archive-17-002.json

The file ossec-archive-17-001.json was last updated at 13:38, when the process rotated it and started writing to ossec-archive-17-002.json.

I have close.on_state_change.inactive: 30m in my filebeat.yml and the file is no longer being written to, so I would expect Filebeat to release its handle on the file, but this is not happening.

[root@REDACTED May]# ls -l ossec-archive-17-001.json 
-rw-r-----. 1 ossec ossec 42951750148 May 17 13:38 ossec-archive-17-001.json
[root@REDACTED May]# date
Tue May 17 15:56:48 UTC 2022
[root@REDACTED May]# lsof ossec-archive-17-001.json 
COMMAND    PID USER   FD   TYPE DEVICE    SIZE/OFF      NODE NAME
filebeat 17382 root   10r   REG   8,17 42951750148 134408332 ossec-archive-17-001.json

More than 2 hours after the file was last written, Filebeat still hasn't released the file handle, so my compression process cannot start.

This didn't happen when I was using the log input, only after I started to use the new filestream input.

Does anyone have any idea how to solve this?

Well, I've just opened a support ticket. If anyone from Elastic wants to check this out, the case number is #00958544.
