Harvester not started for new files in configured paths

dchsueh · January 13, 2022, 9:46pm

Hello,

I have a problem that I've reproduced in filebeat 6.8.3, 7.5.1, and 7.16.3.

In filebeat.yml, under filebeat.inputs, I have an input of type log with three configured paths that contain ** in the middle, something like:

  - /var/log/a/b/**/ONE
  - /var/log/a/b/**/TWO
  - /var/log/a/b/**/THREE

On filebeat startup, if multiple matching files already exist (for example, with C/D, C/E, and C/F standing in for the "**" portion above):

/var/log/a/b/C/D/ONE
/var/log/a/b/C/D/TWO
/var/log/a/b/C/D/THREE
/var/log/a/b/C/E/ONE
/var/log/a/b/C/E/TWO
/var/log/a/b/C/E/THREE
/var/log/a/b/C/F/ONE
/var/log/a/b/C/F/TWO
/var/log/a/b/C/F/THREE

filebeat establishes an input_id for the input and starts harvesters for all of the files present.
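
For reference, a minimal sketch of the input described above. Only the three paths come from this post; the surrounding layout is assumed, and recursive_glob.enabled is written out at its documented default only to make explicit that ** expansion is involved.

```yaml
filebeat.inputs:
  - type: log
    # ** is handled by the log input's recursive glob support, which is
    # enabled by default and expands to a fixed depth (8 levels per the docs).
    recursive_glob.enabled: true
    paths:
      - /var/log/a/b/**/ONE
      - /var/log/a/b/**/TWO
      - /var/log/a/b/**/THREE
```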

However, any NEW files that appear after startup (e.g. a new intermediate directory C/G with its own files ONE, TWO, and THREE) rarely get a harvester. Occasionally a harvester starts for just one of the new files, but even this is very rare.

If no files matching the input's configured paths exist on startup, then filebeat properly notices the three files when they appear and starts harvesters for them.

I confirmed the above behavior both via filebeat -d 'input' and by running lsof on the process.

My read of the documentation and previous discussions (e.g. Filebeats not harvesting new file - #2 by pierhugues) suggests that filebeat should always find and harvest new files that match the configured paths.

I've tried modifying scan_frequency to no effect.
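
A sketch of where scan_frequency sits in the input, assuming the same layout as the paths above. The 1s value is illustrative only (the values actually tried are not stated here); the documented default is 10s.

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/a/b/**/ONE
      - /var/log/a/b/**/TWO
      - /var/log/a/b/**/THREE
    # How often the input checks the configured paths for new files.
    # Illustrative value; the default is 10s.
    scan_frequency: 1s
```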

I'd appreciate any ideas for debugging or solving this, or confirmation that I should file a GitHub issue.

Thank you.

dchsueh · January 13, 2022, 10:07pm

Restarting filebeat starts harvesters for all files present, including the ones missed by the previous process.

Setting symlinks: true makes no difference (on my system, /var/log -> /mnt/var/log).

On 7.16.3, splitting the configuration into three separate inputs with one configured path each increases the probability that a harvester will start, but it still does not pick up all new files. IIRC, splitting into separate inputs on 7.5.1 does not improve the situation.
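
A sketch of the split variant, assuming each path simply moves into its own log input with nothing else changed. The symlinks: true setting mentioned above would be added per input in the same way.

```yaml
filebeat.inputs:
  # One input per path instead of one input with three paths.
  - type: log
    paths:
      - /var/log/a/b/**/ONE
  - type: log
    paths:
      - /var/log/a/b/**/TWO
  - type: log
    paths:
      - /var/log/a/b/**/THREE
```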

dchsueh · January 24, 2022, 5:31pm

Does anybody have any ideas?
I realize this works for most people; for me, it seems to work reliably in some environments but never in others.

Anandu_D · February 7, 2022, 4:46pm

Did you find any solution for this problem?

dchsueh · February 8, 2022, 9:46pm

no solution found yet

dchsueh · March 4, 2022, 10:07pm

no solution found yet

dchsueh · March 22, 2022, 11:02pm

no solution found yet

dchsueh · April 18, 2022, 3:19pm

no solution found yet

system · May 16, 2022, 5:20pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
