Hello,
I have a problem that I've verified in Filebeat 6.8.3, 7.5.1, and 7.16.3.
My filebeat.yml has a filebeat.inputs entry of type log with three configured paths, each with ** in the middle portion, something like:
- /var/log/a/b/**/ONE
- /var/log/a/b/**/TWO
- /var/log/a/b/**/THREE
On Filebeat startup, when multiple matching files already exist, for example with C/D, C/E, and C/F in place of the "**" portion above:
/var/log/a/b/C/D/ONE
/var/log/a/b/C/D/TWO
/var/log/a/b/C/D/THREE
/var/log/a/b/C/E/ONE
/var/log/a/b/C/E/TWO
/var/log/a/b/C/E/THREE
/var/log/a/b/C/F/ONE
/var/log/a/b/C/F/TWO
/var/log/a/b/C/F/THREE
Filebeat establishes an input_id for the input and starts harvesters for all of the files present.
However, any NEW files that appear after startup (e.g. a new intermediate directory C/G with its own files ONE, TWO, and THREE) rarely get harvesters started for them. Occasionally a harvester starts for a single one of the new files, but this is very rare.
If no files matching the input's configured paths exist on startup, then Filebeat properly notices the three files when they appear and starts harvesters for them.
I confirmed the above behavior both via filebeat -d 'input' and via lsof on the process.
My read of the documentation and previous discussions (e.g. Filebeats not harvesting new file - #2 by pierhugues) suggests that Filebeat should always find and harvest new files that match the configured paths.
I've tried modifying scan_frequency to no effect.
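For reference, here is a minimal sketch of the input section described above, with the paths simplified as in this post. The scan_frequency value is only an illustrative placeholder and not the exact setting that was tried. recursive_glob.enabled is written out at its documented default purely to make the ** handling explicit.

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/a/b/**/ONE
      - /var/log/a/b/**/TWO
      - /var/log/a/b/**/THREE
    # Documented default (true); shown only to make the ** expansion explicit.
    recursive_glob.enabled: true
    # Illustrative placeholder, not the exact value tried; the default is 10s.
    scan_frequency: 5s
```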
I'd appreciate any ideas for debugging or solving this, or confirmation that I should file a GitHub issue.
Thank you.