[SOLVED] Filebeat keeps open files forever


(Hamelg) #1

Here is our setup :

filebeat:
  prospectors:
    -
      paths:
        - "/var/log/apache2/*access.log"
      document_type: accesslog
      ignore_older: 5m
      tail_files: true

Every day, logrotate rotates these files.

/var/log/apache2/*.log {
        daily
        rotate 7
        compress
        notifempty
        create 644 root adm
        sharedscripts
        postrotate
                if [ -f '/var/run/apache2.pid' ]; then
                      /etc/init.d/apache2 reload > /dev/null
                fi
        endscript
}

Sometimes (pretty rarely : about once on ten), filebeat keeps old files opened forever. lsof shows us that filebeat is at EOF, but for some unknown reasons it doesn't close the file.

Documentation says about force_close_files :

By default, Filebeat keeps the files that it’s reading open until the timespan specified by ignore_older has elapsed. This behaviour can cause issues when a file is removed.

is there a known issue about that ? What kinds of issue can happen ?

If filebeat reaches EOF on a deleted file, why it keeps it opened ?
I am worry about using force_close_files because I could lose some logs.


(ruflin) #2

Filebeat should latest close a file after ignore_older, in your case 5 minutes. With the next major release we will introduce close_older as ignore_older was having to different purposes which could have conflicted: https://github.com/elastic/beats/pull/718

The file is kept open as filebeat doesn't know, if the file only was renamed and it will continue reading, if the file was actually deleted or it just disappeared for a moment.

About force_close_files: The important part is in the second part of the documentation. If you use force_close_files rotated files are only picked up again after scan_frequency. If the file was removed in the meantime and not all log lines were read before rotating, these log lines are lost.

Which filebeat version are you using?


(Hamelg) #3

we use filebeat 1.0.1
I understand that we have no other choice to use force_close_files until v2.0 .
am i correct ?
thank you


(ruflin) #4

As you use ignore_older: 5m, is already "locking" the file for 5 minutes an issue?


(Hamelg) #5

I'm not sure to understand how it works.
We have changed the setup, now we have :

filebeat:
  prospectors:
    -
      paths:
        - "/var/log/apache2/*.log"
      document_type: accesslog
      force_close_files: true
      tail_files: true

files are rotated at 06:00am :
xxx.log is rename to xxx.log.1
xxx.log.1 is compressed to xxx.log.1.gz
xxx.log.1 is deleted
apache writes now logs to xxx.log

Filebeat has still the previous log file open xxx.log.1 ???
why, it should be closed, because force_close_file is enabled ?
I must wait 24H (default value for ignore_older) before filebeat closes the deleted file ?

# lsof |grep delete|grep beat
....
filebeat 27131 root 20r REG 252,2 2906279 136754 /var/log/apache2/xxx.log.1 (deleted)

in the syslog, we see these errors :
Jan 27 08:13:29 /usr/bin/filebeat[27131]: log.go:94: File reading error. Stopping harvester. Error: Force closing file: /var/log/apache2/xxx.log


(Hamelg) #6

After playing with ignore_older and force_close_files, we notice some issues caused by file rotation :

  • filebeat lets files open forever. With the time, these files eat free space on the filesystem.
  • when ignore_older timer has elapsed without modifications, filebeat ignores definitively the file . When this happen, we must restart filebeat.

It seems these different timers management to poll file/directory modifications are subject to race condition issues.
have you considered to use the inotify API to improve filebeat reliability ?

At now, we are going to try increase scan_frequency timer and see if it fixes our issues.


(Steffen Siering) #7

@hamelg that's the reason close_older was introduced and ignore_older is set to infinity by default. See post by @ruflin


(Hamelg) #8

So until v2.0, the workaround is to restart filebeat at regular time to free deleted files.
I don't see other solution.
How can I test close_older feature, is there a v2.0 snapshot preview somewhere ?
thank you


(Steffen Siering) #9

you can try the nightlies: https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/


(Hamelg) #10

close_older has solved the issue :relaxed:
when close_older will be present, v1.2 or v2 ?

Thank you


(ruflin) #11

It will be for sure in 2.0. We are probably not going to backport it to 2.0. In case you run with the master build, please let us know in case you have any issues.


(iyaozhen) #12

I think it is maybe a bug. @ruflin

My paths conf is /home/work/IP/*/log/ModuleCall.log.*. This 11 module create log very fast. Write file in current hour log, example ModuleCall.log.2016032913. And every hour del before day log example ModuleCall.log.2016032813, But filebeat still open this file:
lr-x------ 1 work work 64 3月 29 12:58 96 -> /home/work/IP/adapter/log/ModuleCall.log.2016032813 (deleted)
Disk space seemingly don't freed. Disk used fast grow to 100%.

My full configure:
filebeat 1.1.2

paths:
  - /home/work/IP/*/log/ModuleCall.log.*
      encoding: utf-8
      fields:
        app_name: channel_log
      fields_under_root: true
      ignore_older: 2h
      scan_frequency: 20s
      harvester_buffer_size: 65536
      tail_files: false
      backoff: 3s
      force_close_files: true

It is a very big problem for me.

When I kill filebeat:

Please forgive my poor english. Thx everyone.


(Moshe Kaplan) #13

Hi,

We have filebeat installed on Amazon AMI w/ log rotated deployed and automatic deletion of old files.
We faced the forvever opened and deleted files and tried to implement the close_older as described in https://www.elastic.co/guide/en/beats/filebeat/1.2/configuration-filebeat-options.html (seems it was impelemented before 2.0)

Yet, when we restarted the service, the configuration did not do much difference (in the end we implemented a cron w/ hourly restart)

We'll be glad to know if there is more elegant way to implement it.

filebeat --version
filebeat version 1.2.2 (amd64)

/etc/filebeat/filebeat.yml

filebeat:
 prospectors:
  -
    paths:
     - /var/app/current/logs/*.log
    input_type: log
    document_type: app_log
    close_older: 1m
 registry_file: /var/lib/filebeat/registry
output:
 logstash:
  hosts: ["xxx.xxx.xxx.xxx:5044"]
shipper:
logging:
 files:
  rotateeverybytes: 10485760 # = 10MB

(ruflin) #14

Your configuration looks good. Does it keep all files open or only some specific ones?


(Moshe Kaplan) #15

All the files (we did not keep it for a long time (we kept it for 15 min), but it opened all the log files, did not close any of them though only one file was active, and deleted files were remained opened.
We restarted the daemon several times


(Mohit Sarhadi) #16

Above resolution does not solve the issue.
I am using latest version of filebeat.
Yaml file on windows.
force_close_files: true
ignore_older: 1m

Files are not rotated.
Everytime i have to restart FileBeat service.
Any resolution ?


(Steffen Siering) #17

see sample config given. Use close_older. No need to enbale force_close_files or set ignore_older.


(Mohit Sarhadi) #18

Thanks :slight_smile:


(Ori Rubinfeld) #19

There is a bug with filebeat 1.2.2, had the same issue.
upgraded to filebeat 1.2.3


(David Radcliffe) #20

Using close_older with 1.2.3 still isn't working for me.