Filebeat I/O Writes Higher Than Expected

Hi,

We're running Filebeat with Kubernetes autodiscover and the Docker input, and on certain Filebeat instances the write throughput is higher than we'd expect.
One example instance is harvesting ~13 log files (a rough count from lsof -p 1), yet its write throughput sits constantly at ~4-5 Mb/s and occasionally spikes to ~20 Mb/s for fairly long periods.

One of the log files it's harvesting is particularly busy (its events are harvested but then dropped by processors), and the instances with these higher write rates are generally the ones harvesting particularly busy logs.

I can see that all the writes go to the registry.new file, which is being written about 100-200 times per second.

$ strace -f -e write -p 1 -s 500 -y
[pid    17] write(3</usr/share/filebeat/data/registry.new>, "[{\"source\":\"/var/lib/docker/containers/eb35e9f6f5210b212f35e8fc590abc6265b4ff2a5baf641eb0749c7da4436e56/eb35e9f6f5210b212f35e8fc590abc6265b4ff2a5baf641eb0749c7da4436e56-json.log\",\"offset\":65477,\"timestamp\":\"2018-11-29T16:37:37.636000699Z\",\"ttl\":-2,\"type\":\"docker\",\"meta\":null,\"FileStateOS\":{\"inode\":526547,\"device\":2049}},{\"source\":\"/var/lib/docker/containers/f6ccfa38ecf5805f467bc9655df6fd2b8ec439d4bce6395f52d34a01034a6304/f6ccfa38ecf5805f467bc9655df6fd2b8ec439d4bce6395f52d34a01034a6304-json.log\","..., 13205) = 13205
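To put a number on the write rate, one option is to capture the strace output to a file for a fixed window and count the write() calls that target registry.new. The following is a minimal sketch rather than a definitive method: the function name count_registry_writes and the capture path /tmp/filebeat.strace are hypothetical, and it assumes strace was run with -y so that file descriptors are annotated with their paths.

```shell
# Hypothetical helper: count write() calls to registry.new in a saved strace log.
# Assumes the log was produced with `strace -y`, which annotates each file
# descriptor with its path, e.g. write(3</usr/share/filebeat/data/registry.new>, ...
count_registry_writes() {
  # grep -c prints the number of matching lines (one line per syscall).
  grep -c 'write([0-9]*<[^>]*registry\.new>' "$1"
}

# Example capture over a 10-second window, then count:
#   timeout 10 strace -f -y -e trace=write -p 1 -o /tmp/filebeat.strace
#   count_registry_writes /tmp/filebeat.strace
```

Dividing the count by the length of the capture window in seconds gives an approximate writes-per-second figure to compare before and after a config change.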

We have the registry_flush period set to 30s rather than the default of flushing after every batch update (config snippet below):

$ cat /etc/filebeat.yml
logging.level: error

filebeat.regsitry_flush: 30s

filebeat.config:
  prospectors:
    # Mounted `filebeat-prospectors` configmap:
    path: ${path.config}/prospectors.d/*.yml
    # Reload prospectors configs as they change:
    reload.enabled: false
  modules:
    path: ${path.config}/modules.d/*.yml
    # Reload module configs as they change:
    reload.enabled: false

filebeat.autodiscover:
  providers:
    - type: kubernetes
      hints.enabled: true

Does the registry flush period only control when the registry file is overwritten by registry.new, or is the configured flush interval perhaps being ignored? Are there other configurable options, such as internal queue sizes, that determine how often the registry file is written?

Any guidance would be appreciated.

Thanks,
Mike

You have a typo here:

filebeat.regsitry_flush: 30s

It should say filebeat.registry_flush. Start with values like 1s, 2s, and 5s.
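Since the misspelled key evidently did not stop Filebeat from starting, a quick check before deploying can catch this kind of mistake. The sketch below uses a hypothetical helper named check_registry_flush that greps a config file for the correctly spelled key. It assumes the flat dotted form filebeat.registry_flush used in this thread and would not recognize the same setting written as nested YAML.

```shell
# Hypothetical helper: report whether a config file sets the correctly
# spelled filebeat.registry_flush key (flat dotted form only).
check_registry_flush() {
  if grep -Eq '^[[:space:]]*filebeat\.registry_flush:' "$1"; then
    echo "ok"
  else
    echo "missing"
    return 1
  fi
}

# Example:
#   check_registry_flush /etc/filebeat.yml
```

A "missing" result means either the setting is absent or its name is misspelled, in which case the configured interval would not be applied.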

@steffens You're a hero. I've stared at this config for so long and I just did not see that typo!

Thanks a lot
