FileBeat - Logstash - Filter CSV - Elastic

Hi @magnusbaeck,

I have installed Filebeat on a local server, where I want to pick up some log files from different folders on a daily basis. However, the folders also contain data for the past 30 days.
Will Filebeat work only on the current day's files and push each one to Logstash only once a day, in other words, without shipping duplicates?

There are also other prospectors in filebeat.yml, so should I create a new config file, or add an additional prospector to the same yml without disturbing the existing config, which is not going to use Logstash as its output?

  • If you only want to ship newer files, then you can use the ignore_older configuration option.
  • What do you mean by no duplicate shipper?
  • One Filebeat instance can only have one output configuration. If you want to send one prospector to LS and another one to ES, you need two Filebeat instances.
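
To make the first and last points concrete, here is a minimal sketch of what a separate config file for a second Filebeat instance might look like. It assumes Filebeat 5.x syntax. The file name, the `path.data` directory, and the Logstash host and port are placeholders, not values from this thread; the path is the one quoted later in the thread.

```yaml
# filebeat-reports.yml (hypothetical file name for the second instance)
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/reports/completed/reports*.csv
    # Skip files whose last modification is older than 24 hours
    ignore_older: 24h

# Assumption: a separate data directory, so this instance keeps its own
# registry and does not interfere with the existing instance
path.data: /var/lib/filebeat-reports

output.logstash:
  # Placeholder host and port; replace with your Logstash beats input
  hosts: ["localhost:5044"]
```

The existing filebeat.yml stays untouched and keeps its current output; only the new instance ships to Logstash.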

So are the files containing the reports new files created every day, or are they appended to old reports? Filebeat tracks the modification date of a file, so if a file is modified, it will fetch all the new lines added to it.

For CSV, you would best use the csv filter in Logstash: https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html But I think that is what you meant above, right?

@ruflin: Any suggestions on the case below?

Is it possible to provide a date value such as `date +%Y-%m-%d` in the filename for the path,
e.g. - /var/log/reports/completed/reports*-`date +%Y-%m-%d`.csv ?

Current scenario:
I have the following config, and the folder contains files from the past 90 days. Each day, 15 new files with different names are generated. When I ran it for the first time, it picked up all 90 days of files and I got a "too many open files" error for harvesting; at that moment Redis also stopped working. I cleared the registry file, left it empty, and re-ran, but it still picked up some random old files.
It doesn't pick up the files that are not older than 24 hours.

Now, how can I make it work the way I intended in the config? Do I need to remove the registry or do some cleanup?

-
  paths:
    - /var/log/reports/completed/reports*.csv
  input_type: log
  document_type: log
  tags: ["REPORT"]
  fields:
     app: recon_files
     ignore_older: 24h
     close_inactive: 1h
     clean_inactive: 25h
  fields_under_root: true
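
One thing that stands out in the config as posted: `ignore_older`, `close_inactive` and `clean_inactive` are indented under `fields`, so Filebeat would most likely treat them as custom fields rather than as prospector options. That would be consistent with the log lines further down, which report a `close_inactive` of 5m0s (the default) instead of 1h. Assuming the indentation shown is how the real file looks, a corrected sketch could be:

```yaml
-
  paths:
    - /var/log/reports/completed/reports*.csv
  input_type: log
  document_type: log
  tags: ["REPORT"]
  fields:
    # Only custom fields belong under "fields"
    app: recon_files
  fields_under_root: true
  # Prospector options, moved to the same level as "paths"
  ignore_older: 24h
  close_inactive: 1h
  # clean_inactive must be greater than ignore_older + scan_frequency
  clean_inactive: 25h
```

With the options at this level, files last modified more than 24 hours ago should be skipped, and their registry entries removed after 25 hours of inactivity.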

Err: Error setting up harvester: Harvester setup failed. Unexpected file opening error: Failed opening /var/log/reports/completed/reports-08-31_07-07-02_000505.csv: open /var/log/reports/completed/reports_2015-08-31_07-07-02_000505.csv: too many open files

> ERR Connecting error publishing events (retrying): lookup server on 10.xxx.x.xx:00: dial udp 10.xxx.x.xx:00: socket: too many open files

> 2017-01-25T15:57:48+01:00 ERR Connecting error publishing events (retrying): read tcp 10.xx.xx.xx:00000->10.xxx.x.xx:0000: read: connection reset by peer
> 2017-01-25T15:58:03+01:00 INFO Non-zero metrics in the last 30s: libbeat.redis.publish.read_errors=1 libbeat.redis.publish.write_bytes=14
> 2017-01-25T15:58:33+01:00 INFO No non-zero metrics in the last 30s
> ERR Writing of registry returned error: open /var/lib/filebeat/registry.new: too many open files. Continuing...
> ERR Failed to create tempfile (/var/lib/filebeat/registry.new) for writing: open /var/lib/filebeat/registry.new: too many open files

Strangely, you can see that this one file is tried more than once, each time without success.

> filebeat.2:2017-01-26T08:33:04+01:00 ERR Harvester could not be started on new file: /var/log/reports/completed/reports_2016-10-25_07-17-02_000451.csv, Err: prospector outlet closed
> filebeat.1:2017-01-26T08:33:09+01:00 INFO Harvester started for file: /var/log/reports/completed/reports_2016-10-25_07-17-02_000451.csv
> filebeat.1:2017-01-26T08:38:14+01:00 INFO File is inactive: /var/log/reports/completed/reports_2016-10-25_07-17-02_000451.csv. Closing because close_inactive of 5m0s reached.
> filebeat:2017-01-26T09:27:08+01:00 INFO Harvester started for file: /var/log/reports/completed/reports_2016-10-25_07-17-02_000451.csv
> filebeat:2017-01-26T09:32:13+01:00 INFO File is inactive: /var/log/reports/completed/reports_2016-10-25_07-17-02_000451.csv. Closing because close_inactive of 5m0s reached.

Filebeat shuts down, and sometimes Redis stops as well.
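
The "too many open files" errors above suggest that the Filebeat process hit its file descriptor limit while starting a harvester for each of the old files at once. Besides raising that limit at the OS level, Filebeat 5.x has prospector options that cap how many files are open at a time. The sketch below is only an illustration: the value of `harvester_limit` is arbitrary, and `close_eof` assumes each report file is written once and never appended to.

```yaml
-
  paths:
    - /var/log/reports/completed/reports*.csv
  input_type: log
  ignore_older: 24h
  # Upper bound on harvesters started in parallel by this prospector
  # (illustrative value, tune to your file descriptor limit)
  harvester_limit: 50
  # Close the file handle as soon as the end of the file is reached;
  # only suitable if files are complete when they appear
  close_eof: true
```

If the report files can still grow after they are first seen, leave `close_eof` out and rely on `close_inactive` instead.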

Let me know if this already solves your problem.

I'm really sorry, I wanted to state that a date string CANNOT be used :frowning:
