FileBeat - Logstash - Filter CSV - Elastic


(sridhar) #1

Hi @magnusbaeck,

I have installed Filebeat on a local server, where I want to pick up some log files from different folders on a daily basis. The folders also contain data for the past 30 days.
Will Filebeat work only on the files with the current date and push them to Logstash just once a day, or in other words, no duplicate shipper?

There are also other prospectors in filebeat.yml, so should I create a new config, or add an additional prospector to the same yml without disturbing the existing config, which is not going to use Logstash as output?


(ruflin) #2
  • If you only want to ship newer files, you can use the ignore_older configuration option.
  • What do you mean by no duplicate shipper?
  • One Filebeat instance can only have one output configuration. If you want to send one prospector to Logstash (LS) and another one to Elasticsearch (ES), you need two Filebeat instances.
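
As a rough illustration of the first and third points, a separate Filebeat instance for the report files could use a config file of its own along these lines. This is only a sketch assuming Filebeat 5.x syntax; the file name, the path, the 24h window, and the Logstash host `logstash.example.local:5044` are placeholders, not values from this thread.

```yaml
# filebeat-reports.yml: hypothetical config for a second Filebeat instance
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/reports/*.csv     # placeholder path
    # Skip files whose last modification is older than this window.
    ignore_older: 24h

# This instance ships only to Logstash; the existing instance keeps its own output.
output.logstash:
  hosts: ["logstash.example.local:5044"]   # placeholder host
```

Each instance would also need its own registry file and data path so that the two do not overwrite each other's state.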

(ruflin) #4

So are the files containing the reports new files created every day, or are the reports appended to old files? Filebeat tracks the modification date of a file, so if a file is modified, Filebeat fetches all the new lines added to it.

For CSV, your best option is the csv filter in Logstash: https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html. But I think that is what you meant above, right?


(sridhar) #6

@ruflin: Any suggestions on the case below?

Is it possible to provide a date value such as `date +%Y-%m-%d` in the filename of the path,
e.g. /var/log/reports/completed/reports*-`date +%Y-%m-%d`.csv?

Current scenario:
I have the following config, and the folder contains files from the past 90 days. Each day, 15 new files with different names are generated. When I ran Filebeat for the first time, it picked up all 90 days of files, reported "too many open files" errors while harvesting, and at that moment Redis also stopped working. I emptied the registry file and re-ran Filebeat, but it still picked up some random old files.
It does not pick up the files that are less than 24 hours old.

How can I make it work the way I intended in the config? Do I need to remove the registry or do some other cleanup?

-
  paths:
    - /var/log/reports/completed/reports*.csv
  input_type: log
  document_type: log
  tags: ["REPORT"]
  fields:
     app: recon_files
     ignore_older: 24h
     close_inactive: 1h
     clean_inactive: 25h
  fields_under_root: true

Err: Error setting up harvester: Harvester setup failed. Unexpected file opening error: Failed opening /var/log/reports/completed/reports-08-31_07-07-02_000505.csv: open /var/log/reports/completed/reports_2015-08-31_07-07-02_000505.csv: too many open files

> ERR Connecting error publishing events (retrying): lookup server on 10.xxx.x.xx:00: dial udp 10.xxx.x.xx:00: socket: too many open files

> 2017-01-25T15:57:48+01:00 ERR Connecting error publishing events (retrying): read tcp 10.xx.xx.xx:00000->10.xxx.x.xx:0000: read: connection reset by peer
> 2017-01-25T15:58:03+01:00 INFO Non-zero metrics in the last 30s: libbeat.redis.publish.read_errors=1 libbeat.redis.publish.write_bytes=14
> 2017-01-25T15:58:33+01:00 INFO No non-zero metrics in the last 30s
> ERR Writing of registry returned error: open /var/lib/filebeat/registry.new: too many open files. Continuing...
> ERR Failed to create tempfile (/var/lib/filebeat/registry.new) for writing: open /var/lib/filebeat/registry.new: too many open files

Strangely, you can see below that this one file is tried more than once, and never successfully.

> filebeat.2:2017-01-26T08:33:04+01:00 ERR Harvester could not be started on new file: /var/log/reports/completed/reports_2016-10-25_07-17-02_000451.csv, Err: prospector outlet closed
> filebeat.1:2017-01-26T08:33:09+01:00 INFO Harvester started for file: /var/log/reports/completed/reports_2016-10-25_07-17-02_000451.csv
> filebeat.1:2017-01-26T08:38:14+01:00 INFO File is inactive: /var/log/reports/completed/reports_2016-10-25_07-17-02_000451.csv. Closing because close_inactive of 5m0s reached.
> filebeat:2017-01-26T09:27:08+01:00 INFO Harvester started for file: /var/log/reports/completed/reports_2016-10-25_07-17-02_000451.csv
> filebeat:2017-01-26T09:32:13+01:00 INFO File is inactive: /var/log/reports/completed/reports_2016-10-25_07-17-02_000451.csv. Closing because close_inactive of 5m0s reached.

Filebeat shuts down, and sometimes Redis stops as well.
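
One thing worth checking in the config above: as pasted, `ignore_older`, `close_inactive` and `clean_inactive` are indented under `fields:`, so Filebeat would treat them as custom fields rather than as prospector options. The log lines reporting `close_inactive of 5m0s` (the default) instead of the configured 1h are consistent with that. A minimal sketch of the same prospector with those options moved to prospector level, assuming Filebeat 5.x syntax and that the excerpt sits under `filebeat.prospectors:`:

```yaml
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/reports/completed/reports*.csv
    document_type: log
    tags: ["REPORT"]
    fields:
      app: recon_files
    fields_under_root: true
    # Prospector-level options, moved out from under `fields:`
    ignore_older: 24h
    close_inactive: 1h
    clean_inactive: 25h   # must be greater than ignore_older + scan_frequency
    # Optional, and an assumption here: cap the number of files harvested
    # in parallel to stay under the open-file limit. The value is a placeholder.
    # harvester_limit: 50
```

With `ignore_older` actually applied, the 90 days of old files should no longer all be opened at once, which is the likely source of the `too many open files` errors.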


(ruflin) #7

Let me know if this already solves your problem.


(ruflin) #10

I'm really sorry, I meant to say that a date string CANNOT be used in the path :(

