Filebeat not picking the files as expected


(sridhar) #1

I have the following config:

  • /var/lib/transfers/report/*/archive/processed/ABC-Reconciliation*.csv --> new report files are generated here every day

Each day, 15 new files with different names are generated for one of the defined prospectors. When I ran Filebeat for the first time, it picked up all the files from the past 90 days, hit a "too many open files" error while harvesting, and at that moment Redis also stopped working. I cleared the registry file, left it empty, and re-ran it, but it still picked up some random old-dated files.
It doesn't pick up the files that are not older than 24 hours.

Now none of the files are picked up and it has stopped completely; it only works for a while whenever we restart Filebeat, and after some time it stops harvesting the files again.

P.S.: /home/a/b/logs/check.log --> this is a dynamic logfile, rotated every day

How can I make it work as configured?


(Mark Walkom) #3

Please make your thread titles a little more descriptive.


(ruflin) #4
  • Please share your FB version
  • Your full config file
  • Your logs

And as Mark asked, please make your title more descriptive. Thx.


(sridhar) #5

Hi @ruflin,

Please find the details:

/usr/share/filebeat/bin/filebeat-god -version
Go daemon v1.2
grep -v "#" /etc/filebeat/filebeat.yml
filebeat:
  prospectors:
    -
      paths:
        - /home/a/c/main.log
      input_type: log
      fields:
         app: main
      fields_under_root: true
      document_type: log
    -
      paths:
        - /home/a/b/logs/check.log
      input_type: log
      document_type: log
      fields:
         app: trans
      fields_under_root: true
    -
      paths:
        - /var/lib/transfers/report/*/archive/processed/ABC-Reconciliation*.csv
      input_type: log
      document_type: log
      tags: ["REPORT"]
      fields:
         app: recon_files
         ignore_older: 24h
         close_inactive: 1h
         clean_inactive: 25h
      fields_under_root: true
registry_file: /var/lib/filebeat/registry
output:
 redis:
    hosts: ["a-redis-loadbalancer.sys:1234"]
    key: "filebeat"
    db: 0
 timeout: 5
 reconnect_interval: 1
shipper:
logging:
files:

(sridhar) #6

Unfortunately I could not attach the logfiles due to some restriction, so here is an excerpt:

2017-01-27T15:37:37+01:00 INFO Home path: [/usr/share/filebeat] Config path: [/etc/filebeat] Data path: [/var/lib/filebeat] Logs path: [/var/log/filebeat]
2017-01-27T15:37:37+01:00 INFO Setup Beat: filebeat; Version: 5.0.0
2017-01-27T15:37:37+01:00 INFO Max Retries set to: 3
2017-01-27T15:37:37+01:00 INFO Activated redis as output plugin.
2017-01-27T15:37:37+01:00 INFO Publisher name: abcd-rep01.host.sys
2017-01-27T15:37:37+01:00 INFO Flush Interval set to: 1s
2017-01-27T15:37:37+01:00 INFO Max Bulk Size set to: 2048
2017-01-27T15:37:37+01:00 INFO filebeat start running.
2017-01-27T15:37:37+01:00 INFO Registry file set to: /var/lib/filebeat/registry
2017-01-27T15:37:37+01:00 INFO Loading registrar data from /var/lib/filebeat/registry
2017-01-27T15:37:37+01:00 ERR Error decoding states: EOF
2017-01-27T15:37:37+01:00 ERR Error loading state: EOF
2017-01-27T15:37:37+01:00 ERR Could not start registrar: EOF
2017-01-27T15:37:37+01:00 INFO Loading Prospectors: 3
2017-01-27T15:37:37+01:00 INFO Load previous states from registry into memory
2017-01-27T15:37:37+01:00 INFO Previous states loaded: 0
2017-01-27T15:37:37+01:00 INFO Load previous states from registry into memory
2017-01-27T15:37:37+01:00 INFO Previous states loaded: 0
2017-01-27T15:37:37+01:00 INFO Load previous states from registry into memory
2017-01-27T15:37:37+01:00 INFO Previous states loaded: 0
2017-01-27T15:37:37+01:00 INFO Loading Prospectors completed. Number of prospectors: 3
2017-01-27T15:37:37+01:00 INFO All prospectors are initialised and running with 0 states to persist
2017-01-27T15:37:37+01:00 INFO Starting prospector of type: log
2017-01-27T15:37:37+01:00 INFO Starting spooler: spool_size: 2048; idle_timeout: 5s
2017-01-27T15:37:37+01:00 INFO Start sending events to output
2017-01-27T15:37:37+01:00 INFO Starting prospector of type: log
2017-01-27T15:37:37+01:00 INFO Starting prospector of type: log
2017-01-27T15:37:37+01:00 INFO Harvester started for file: /home/a/b/logs/check.log
2017-01-27T15:37:37+01:00 INFO Harvester started for file: /var/lib/transfers/report/A/archive/processed/ABC-ReconciliationA_2016-09-25_07-30-06_002046.csv
2017-01-27T15:37:37+01:00 INFO Harvester started for file: /var/lib/transfers/report/A/archive/processed/ABC-ReconciliationA_2016-02-22_07-30-06_000809.csv
2017-01-27T15:37:37+01:00 INFO Harvester started for file: /var/lib/transfers/report/A/archive/processed/ABC-ReconciliationA_2016-09-05_07-35-01_000702.csv
..
..
2017-01-27T15:38:07+01:00 INFO Non-zero metrics in the last 30s: libbeat.redis.publish.read_bytes=2297 filebeat.harvester.open_files=413 libbeat.redis.publish.write_bytes=1438936 publish.events=4096 filebeat.harvester.running=412 filebeat.harvester.started=412 libbeat.publisher.published_events=4095
2017-01-27T15:38:37+01:00 INFO No non-zero metrics in the last 30s
2017-01-27T15:39:07+01:00 INFO No non-zero metrics in the last 30s
2017-01-27T15:39:37+01:00 INFO No non-zero metrics in the last 30s
2017-01-27T15:41:37+01:00 INFO No non-zero metrics in the last 30s
2017-01-27T15:42:07+01:00 INFO No non-zero metrics in the last 30s
2017-01-27T15:42:37+01:00 INFO No non-zero metrics in the last 30s
2017-01-27T15:42:42+01:00 INFO File is inactive: /var/lib/transfers/report/A/archive/processed/ABC-ReconciliationA_2016-03-04_07-23-01_000664.csv. Closing because close_inactive of 5m0s reached.
2017-01-27T15:42:42+01:00 INFO File is inactive: /var/lib/transfers/report/B/archive/processed/ABC-ReconciliationA_2016-05-15_16-02-21_000279.csv. Closing because close_inactive of 5m0s reached.
2017-01-27T15:42:42+01:00 INFO File is inactive: /var/lib/transfers/report/C/archive/processed/ABC-ReconciliationA_2016-04-13_13-27-26_000798.csv. Closing because close_inactive of 5m0s reached.
2017-01-27T15:42:42+01:00 INFO File is inactive: /var/lib/transfers/report/H/archive/processed/ABC-ReconciliationA_2016-04-25_07-11-01_000159.csv. Closing because close_inactive of 5m0s reached.
2017-01-27T15:42:42+01:00 INFO File is inactive: /var/lib/transfers/report/E/archive/processed/ABC-ReconciliationA_2016-11-03_23-44-31_002103.csv. Closing because close_inactive of 5m0s reached.
2017-01-27T15:42:42+01:00 INFO File is inactive: /var/lib/transfers/report/G/archive/processed/ABC-ReconciliationA_2015-10-29_07-10-02_000665.csv. Closing because close_inactive of 5m0s reached.
2017-01-27T15:42:42+01:00 INFO File is inactive: /var/lib/transfers/report/F/archive/processed/ABC-ReconciliationA_2016-11-13_07-26-01_001038.csv. Closing because close_inactive of 5m0s reached.
2017-01-27T15:42:42+01:00 INFO File is inactive: /var/lib/transfers/report/B/archive/processed/ABC-ReconciliationA_2016-12-07_03-44-51_000500.csv. Closing because close_inactive of 5m0s reached.
2017-01-27T15:42:42+01:00 INFO File is inactive: /var/lib/transfers/report/H/archive/processed/ABC-ReconciliationA_2016-01-30_07-11-02_000067.csv. Closing because close_inactive of 5m0s reached.
2017-01-27T15:42:42+01:00 INFO File is inactive: /var/lib/transfers/report/H/archive/processed/ABC-ReconciliationA_2016-09-12_07-11-02_000311.csv. Closing because close_inactive of 5m0s reached.
2017-01-27T15:43:07+01:00 INFO Non-zero metrics in the last 30s: filebeat.harvester.open_files=-10
2017-01-27T15:44:07+01:00 INFO No non-zero metrics in the last 30s
...
...
2017-01-30T10:51:07+01:00 INFO No non-zero metrics in the last 30s

(sridhar) #7

I have added harvester_limit: 1 and restarted Filebeat, but it is still the same; it just picked up a few lines for the check.log prospector, and after that nothing.
Also, I see some strange errors below during startup for the registry file. Is that something to look into?

> 2017-01-30T14:33:13+01:00 INFO filebeat start running.
> 2017-01-30T14:33:13+01:00 INFO Registry file set to: /var/lib/filebeat/registry
> 2017-01-30T14:33:13+01:00 INFO Loading registrar data from /var/lib/filebeat/registry
> _**2017-01-30T14:33:13+01:00 ERR Error decoding states: EOF**_
> _**2017-01-30T14:33:13+01:00 ERR Error loading state: EOF**_
> _**2017-01-30T14:33:13+01:00 ERR Could not start registrar: EOF**_
> 2017-01-30T14:33:13+01:00 INFO Loading Prospectors: 3
> 2017-01-30T14:33:13+01:00 INFO Load previous states from registry into memory
> 2017-01-30T14:33:13+01:00 INFO Previous states loaded: 0
> 2017-01-30T14:33:13+01:00 INFO Load previous states from registry into memory
> 2017-01-30T14:33:13+01:00 INFO Previous states loaded: 0
> 2017-01-30T14:33:13+01:00 INFO Load previous states from registry into memory
> 2017-01-30T14:33:13+01:00 INFO Previous states loaded: 0
> 2017-01-30T14:33:13+01:00 INFO Loading Prospectors completed. Number of prospectors: 3
> 2017-01-30T14:33:13+01:00 INFO All prospectors are initialised and running with 0 states to persist

> 2017-01-30T14:42:17+01:00 INFO Total non-zero values:  libbeat.redis.publish.read_bytes=2301 **filebeat.harvester.closed=1** **filebeat.harvester.open_files=5226** publish.events=4096 libbeat.redis.publish.write_bytes=1438936 filebeat.harvester.started=1 libbeat.publisher.published_events=4095
> 2017-01-30T14:42:17+01:00 INFO Uptime: 9m3.397713341s
> 2017-01-30T14:42:17+01:00 INFO filebeat stopped.

(sridhar) #8

@warkolm @ruflin Is there any other information that could help investigate this further?
I strongly suspect something with the registry file, as I see it is 0 bytes.

And for the recon prospector, it just picks up files randomly; ignore_older doesn't seem to be working as configured. I enabled debug mode as well, but found no errors except for the registry file. Even after a restart, no new registry file is created and its contents don't change (0 bytes).
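
A quick way to sanity-check the registry before restarting (a sketch; the REGISTRY path matches the registry_file setting above, adjust if yours differs):

```shell
#!/bin/sh
# Inspect the Filebeat registry; a 0-byte file explains the
# "Error decoding states: EOF" messages at startup.
REGISTRY="${REGISTRY:-/var/lib/filebeat/registry}"

if [ ! -s "$REGISTRY" ]; then
  echo "registry missing or empty: $REGISTRY"
  # With filebeat stopped, move the broken file aside so a
  # fresh registry is created on the next start:
  #   mv "$REGISTRY" "$REGISTRY.broken"
else
  echo "registry size: $(wc -c < "$REGISTRY") bytes"
  head -c 200 "$REGISTRY"; echo
fi
```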


(sridhar) #9

@Nicolas_Ruflin @ruflin @warkolm @steffens

  1. I have figured out the first issue: the registry file was somehow broken. I moved it to a different location and restarted Filebeat, which created a new registry file, and logs were sent successfully to Redis.

  2. However, the recon_files prospector still misbehaves (updated with harvester_limit as per Ruflin's suggestion):

>   paths:
>     - /var/lib/transfers/report/*/archive/processed/ABC-Reconciliation*.csv
>   input_type: log
>   document_type: log
>   tags: ["REPORT"]
>   fields:
>      app: recon_files
>      ignore_older: 24h
>      close_inactive: 1h
>      clean_inactive: 25h
>      harvester_limit: 9
>      scan_frequency: 10m
>   fields_under_root: true

It picked up all the files in the path and errored out with "too many open files", which should not happen with this config, as it should ignore files older than 24 hours. That said, these files were never harvested before; this is the first time Filebeat runs against this location.
So something seems to be missing in this specific configuration.

I have reproduced the issue: it happens when the filename pattern contains "*.csv", which then matches everything in the directory even though ignore_older is configured. Maybe it is a bug in Filebeat's configuration handling.

Can you tell me how I can use a date parameter in the filename part of the path to pick up only today's logfile?

Otherwise I will have to move today's logfiles alone to a separate folder with an additional script and update the path, to work around the current situation.


(ruflin) #10

harvester_limit needs to be inside the prospector, not inside fields. The same is true for all the other config options in there. Not sure why you put them under fields?
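
For illustration, a sketch of how that prospector would look with the options at prospector level (assuming Filebeat 5.x syntax), keeping only custom metadata under fields:

```yaml
- paths:
    - /var/lib/transfers/report/*/archive/processed/ABC-Reconciliation*.csv
  input_type: log
  document_type: log
  tags: ["REPORT"]
  ignore_older: 24h      # prospector-level option: skip files last modified more than 24h ago
  close_inactive: 1h     # close handles on files with no new data for 1h
  clean_inactive: 25h    # must be greater than ignore_older + scan_frequency
  harvester_limit: 9
  scan_frequency: 10m
  fields:
    app: recon_files     # custom metadata only
  fields_under_root: true
```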


(sridhar) #11

@ruflin

I have corrected the config now; it really was that simple to get it working.
I was referring to some outdated, bad examples before.

2017-01-31T13:31:05+01:00 DBG Ignore file because ignore_older reached: /var/lib/transfers/report/abc/archive/processed/ABC-Reconciliation_ABC_2016-10-12_07-11-01_000349.csv

One last thing: can I add multiple paths for the same prospector?
E.g.:

>  paths:
>     - /var/lib/transfers/report/A/archive/processed/ABC-Reconciliation*.csv
>     - /var/lib/transfers/report/B/archive/processed/BAC-Reconciliation*.csv
>     - /var/lib/transfers/report/C/archive/processed/CAB-Reconciliation*.csv
>     - /var/lib/transfers/report/D/archive/processed/AC-Reconciliation*.csv
>     - /var/lib/transfers/report/E/archive/processed/BC-Reconciliation*.csv
>     - /var/lib/transfers/report/F/archive/processed/AB-Reconciliation*.csv
>     - /var/lib/transfers/report/G/archive/processed/BA-Reconciliation*.csv
>     - /var/lib/transfers/report/H/archive/processed/CA-Reconciliation*.csv
>     input_type: log
>     document_type: log
>     tags: ["REPORT"]
>     ignore_older: 24h
>     close_inactive: 1h
>     clean_inactive: 25h
>     harvester_limit: 8
>     scan_frequency: 10m
>     fields:
>        app: recon_files
>     fields_under_root: true

Is this a valid config?


(sridhar) #12

Okay, it worked.


(ruflin) #13
  • Multiple paths: yes
  • Your config still looks off: everything like input_type should have the same indentation as paths.

(system) #14

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.