Hi,
I'm seeing something strange with filebeat 5.0.0-alpha3. I have prospectors configured on my mesos slave task stdout/stderr files, and occasionally filebeat gets stuck in a state where it repeatedly sends the same log messages from the same file. In the latest case of this, I see the following in the filebeat logs around the time when the issue started:
/var/log/filebeat/filebeat.log:2016-08-01T16:39:22Z INFO Harvester started for file: /tmp/mesos/slave/slaves/d1dc259c-bd93-4ed6-8906-5124cab07048-S228/frameworks/Singularity/executors/registrator-7_dev_3-1468270708127-7-platform_apps_mesos_slave_3.node.us_east_1_z.consul-us_east_1c/runs/latest/stderr
/var/log/filebeat/filebeat.log.1:2016-08-01T16:20:14Z INFO Harvester started for file: /tmp/mesos/slave/slaves/d1dc259c-bd93-4ed6-8906-5124cab07048-S228/frameworks/Singularity/executors/registrator-7_dev_3-1468270708127-7-platform_apps_mesos_slave_3.node.us_east_1_z.consul-us_east_1c/runs/latest/stderr
/var/log/filebeat/filebeat.log.3:2016-08-01T16:20:11Z INFO Harvester started for file: /tmp/mesos/slave/slaves/d1dc259c-bd93-4ed6-8906-5124cab07048-S228/frameworks/Singularity/executors/registrator-7_dev_3-1468270708127-7-platform_apps_mesos_slave_3.node.us_east_1_z.consul-us_east_1c/runs/latest/stderr
/var/log/filebeat/filebeat.log.6:2016-08-01T16:20:06Z INFO Harvester started for file: /tmp/mesos/slave/slaves/d1dc259c-bd93-4ed6-8906-5124cab07048-S228/frameworks/Singularity/executors/registrator-7_dev_3-1468270708127-7-platform_apps_mesos_slave_3.node.us_east_1_z.consul-us_east_1c/runs/latest/stderr
The file in question is not very big:
> cat /tmp/mesos/slave/slaves/d1dc259c-bd93-4ed6-8906-5124cab07048-S228/frameworks/Singularity/executors/registrator-7_dev_3-1468270708127-7-platform_apps_mesos_slave_3.node.us_east_1_z.consul-us_east_1c/runs/latest/stderr | wc -l
62568
My prospector config for the mesos stderr task logs looks like this:
prospectors:
  - paths:
      - "/tmp/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stderr*"
    exclude_files:
      - "^.+logstash.+$"
      - "^.+logrotate.conf$"
    ignore_older: 10m
    multiline:
      pattern: "^[[:space:]]"
      match: after
From the log output above, it looks like several harvesters were started within a handful of seconds of each other for the same file. Soon after this, I see the same log messages from this file forwarded over and over, which ends up flooding my pipeline (I'm forwarding to kafka and indexing with logstash consuming from kafka). Is it possible that the multiple separate harvesters are each updating the offset for this file, causing it to be repeatedly reset and the messages to be resent?
Note that I'm using 5.0.0-alpha3. It looks like there has been a lot of work in this area, and I'm hopeful the changes in https://github.com/elastic/beats/issues/1600 might help. It's also unclear to me whether all the new close_/clean_ options described in the 5.0.0-alpha4 documentation and in the issue above are actually available in the 5.0.0-alpha4 release, or whether they are still only in the nightly builds. Any hints on when the next 5.0.0 release will be coming?
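In case it helps, this is roughly what I'm hoping my prospector config will look like once those options are available. The option names (close_inactive, clean_inactive, clean_removed) and the values here are taken from the alpha4 docs and the issue above; I haven't been able to verify them against any released build yet, so treat this as a sketch rather than something I've tested:
prospectors:
  - paths:
      - "/tmp/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stderr*"
    exclude_files:
      - "^.+logstash.+$"
      - "^.+logrotate.conf$"
    ignore_older: 10m
    # close the harvester once the file hasn't been updated for a while,
    # so finished task sandboxes don't keep harvesters (and offsets) alive
    close_inactive: 5m
    # drop registry state for files that haven't been seen recently
    # (must be larger than ignore_older), and for files that were removed
    clean_inactive: 1h
    clean_removed: true
    multiline:
      pattern: "^[[:space:]]"
      match: after
The intent would be to stop old task stdout/stderr files from accumulating stale harvesters and registry entries, which is my best guess at what's triggering the duplicate sends.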