Filebeat fills up the disk very quickly and writes logs repeatedly

In our environment, Filebeat logs a lot of these errors (the file name has been replaced):

DEBUG [input] log/input.go:261 State for file not removed because harvester not finished: /var/log/xxx.log
DEBUG [input] log/input.go:251 Remove state of file as its identity has changed: /var/log/xxx.log

When the error occurred, a large number of file descriptors were held open (the disk filled up very quickly, and Filebeat had to be stopped and restored), and a large number of duplicate logs appeared in Elasticsearch.

I suspect it may be related to file rotation.

This is my configuration file (part of the content has been replaced):

filebeat.inputs:
- type: log
  enabled: true
  close_inactive: 10s
  scan_frequency: 1s
  paths:
    - /var/log/xxx.log
  fields:
    type: "xxx"

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_locale: ~
  - add_fields:
      target: host
      fields:
        ip: ["192.168.2.65"]

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["192.168.2.253:9200"]
  indices:
    - index: "filebeat-xxx-%{+yyy.MM.dd}"
  pipelines:
    - pipeline: "xxx"
  username: "username"
  password: "password"
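
Since I suspect rotation, here for comparison is my understanding of the two defaults that this config overrides (a sketch for reference only, not something we actually run; values are the documented defaults as far as I know):

filebeat.inputs:
- type: log
  paths:
    - /var/log/xxx.log
  # Defaults as far as I know: close_inactive is 5m and scan_frequency is 10s;
  # the config above lowers them to 10s and 1s.
  close_inactive: 5m
  scan_frequency: 10s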

The Filebeat version is: filebeat version 7.9.2 (amd64), libbeat 7.9.2 [2ab907f5ccecf9fd82fe37105082e89fd871f684 built 2020-09-22 23:19:45 +0000 UTC]

The above is what I saw when I enabled all log levels. When the error occurs, the normal (INFO-level) output looks like this:

2020-10-16T21:14:24.933+0800 INFO log/harvester.go:299 Harvester started for file: /var/log/xxx.log
2020-10-16T21:14:25.952+0800 INFO log/harvester.go:299 Harvester started for file: /var/log/xxx.log
2020-10-16T21:14:26.969+0800 INFO log/harvester.go:299 Harvester started for file: /var/log/xxx.log
2020-10-16T21:14:27.980+0800 INFO log/harvester.go:299 Harvester started for file: /var/log/xxx.log
2020-10-16T21:14:28.986+0800 INFO log/harvester.go:299 Harvester started for file: /var/log/xxx.log
2020-10-16T21:14:29.988+0800 INFO log/harvester.go:299 Harvester started for file: /var/log/xxx.log
2020-10-16T21:14:30.994+0800 INFO log/harvester.go:299 Harvester started for file: /var/log/xxx.log

What log rotation strategy are you using? Are the files overwritten or renamed/removed?

Was there any error in the output when the problem occurred?

Our log rotation strategy is: when a log file exceeds the size limit, the current file is closed and renamed, and then a new file is opened.
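
For a rename-and-recreate rotation like this, the log input options that control when Filebeat lets go of the old (renamed) file are, as far as I understand them, close_renamed, close_removed and clean_removed. A minimal sketch of setting them explicitly (illustration only, not what we currently run):

filebeat.inputs:
- type: log
  paths:
    - /var/log/xxx.log
  # Close the harvester once the file has been renamed (off by default).
  close_renamed: true
  # Close the harvester once the file is deleted (true by default).
  close_removed: true
  # Drop registry state for files that no longer exist on disk (true by default).
  clean_removed: true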

When the problem occurred, if I ran Filebeat with filebeat -e, the logs looked like this:

2020-10-16T21:14:17.862+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:18.865+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:19.879+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:20.884+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:21.899+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:22.912+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:23.921+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:24.933+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:25.952+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:26.969+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:27.980+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:28.986+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:29.988+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log
2020-10-16T21:14:30.994+0800    INFO    log/harvester.go:299    Harvester started for file: /var/log/daemon.log

When I ran Filebeat with filebeat -e -d '*', it produced tons of logs (maybe 1 GB per minute); apart from the transmitted JSON events, the rest looked like this:

2020-10-17T08:05:45.114+0800    DEBUG   [registrar]     registrar/registrar.go:201      Registry file updated. 44 active states.
2020-10-17T08:05:45.114+0800    DEBUG   [registrar]     registrar/registrar.go:231      Registrar state updates processed. Count: 50
2020-10-17T08:05:45.114+0800    DEBUG   [registrar]     registrar/registrar.go:264      Processing 50 events
2020-10-17T08:05:45.113+0800    DEBUG   [publisher]     memqueue/ackloop.go:131 ackloop:  done send ack
2020-10-17T08:05:45.113+0800    DEBUG   [publisher]     memqueue/ackloop.go:128 ackloop: return ack to broker loop:50
2020-10-17T08:05:45.113+0800    DEBUG   [acker] beater/acker.go:59      stateful ack    {"count": 50}
2020-10-17T08:05:45.113+0800    DEBUG   [publisher]     memqueue/eventloop.go:535       broker ACK events: count=50, start-seq=1333773, end-seq=1333822
2020-10-17T08:05:45.113+0800    DEBUG   [publisher]     memqueue/ackloop.go:160 ackloop: receive ack [32491: 0, 50]
2020-10-17T08:05:45.112+0800    DEBUG   [elasticsearch] elasticsearch/client.go:229     PublishEvents: 50 events have been published to elasticsearch in 25.314964ms.
2020-10-17T08:05:45.089+0800    DEBUG   [registrar]     registrar/registrar.go:201      Registry file updated. 44 active states.
2020-10-17T08:05:45.089+0800    DEBUG   [registrar]     registrar/registrar.go:231      Registrar state updates processed. Count: 50
2020-10-17T08:05:45.089+0800    DEBUG   [registrar]     registrar/registrar.go:264      Processing 50 events
2020-10-17T08:05:45.087+0800    DEBUG   [publisher]     memqueue/ackloop.go:131 ackloop:  done send ack
2020-10-17T08:05:45.087+0800    DEBUG   [publisher]     memqueue/ackloop.go:128 ackloop: return ack to broker loop:50
2020-10-17T08:05:45.087+0800    DEBUG   [acker] beater/acker.go:59      stateful ack    {"count": 50}
2020-10-17T08:05:45.087+0800    DEBUG   [publisher]     memqueue/eventloop.go:535       broker ACK events: count=50, start-seq=1333723, end-seq=1333772
2020-10-17T08:05:45.087+0800    DEBUG   [publisher]     memqueue/ackloop.go:160 ackloop: receive ack [32490: 0, 50]
2020-10-17T08:05:45.087+0800    DEBUG   [elasticsearch] elasticsearch/client.go:229     PublishEvents: 50 events have been published to elasticsearch in 15.761547ms.
2020-10-17T08:05:45.073+0800    DEBUG   [registrar]     registrar/registrar.go:201      Registry file updated. 44 active states.
2020-10-17T08:05:45.073+0800    DEBUG   [registrar]     registrar/registrar.go:231      Registrar state updates processed. Count: 50
2020-10-17T08:05:45.073+0800    DEBUG   [registrar]     registrar/registrar.go:264      Processing 50 events
2020-10-17T08:05:45.071+0800    DEBUG   [publisher]     memqueue/ackloop.go:131 ackloop:  done send ack
2020-10-17T08:05:45.071+0800    DEBUG   [publisher]     memqueue/ackloop.go:128 ackloop: return ack to broker loop:50
2020-10-17T08:05:45.071+0800    DEBUG   [acker] beater/acker.go:59      stateful ack    {"count": 50}
2020-10-17T08:05:45.071+0800    DEBUG   [publisher]     memqueue/eventloop.go:535       broker ACK events: count=50, start-seq=1333673, end-seq=1333722
2020-10-17T08:05:45.071+0800    DEBUG   [publisher]     memqueue/ackloop.go:160 ackloop: receive ack [32489: 0, 50]
2020-10-17T08:05:45.068+0800    DEBUG   [elasticsearch] elasticsearch/client.go:229     PublishEvents: 50 events have been published to elasticsearch in 20.835026ms.
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.065+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.064+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.064+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.064+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.064+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.064+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log
2020-10-17T08:05:45.064+0800    DEBUG   [input] log/input.go:251        Remove state of file as its identity has changed: /var/log/daemon.log
2020-10-17T08:05:45.064+0800    DEBUG   [input] log/input.go:261        State for file not removed because harvester not finished: /var/log/daemon.log

I think you are impacted by this recently reported problem: https://github.com/elastic/beats/pull/21911

Also, please check whether there are any errors in the Elasticsearch logs.
