I intend to drop .json files into a directory that Filebeat is monitoring (they eventually end up in Elasticsearch).
The goal is to poll the data source for the latest changes every 15 minutes and write them to a .json file. The number of changes will probably be pretty small on average. The total record count today is under 600k.
Once I'm done processing the file I would like to remove it.
Scenario A: Create a new .json file every 15 minutes with the changes.
Scenario B: Create a new .json file each day and append changes to it every 15 minutes.
Q1: Which scenario is the better route to go? Does it matter?
Q2: I am responsible for manually removing the .json file, correct?
If so, I was planning on reading the registry file and removing any files not found therein.
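To make that plan concrete, here is a rough sketch of the cleanup I have in mind. It assumes the older single-file registry layout, i.e. a JSON array of entries that each carry a "source" path; newer Filebeat versions store the registry differently, so this would need adapting. The function names and the min_age_seconds guard are my own invention: the guard is there so that a file Filebeat simply has not picked up yet is not mistaken for one that has already been cleaned out of the registry.

```python
import json
import os
import time


def find_removable(watch_dir, registry_path, min_age_seconds=60):
    """Return .json files in watch_dir that have no entry in the registry."""
    with open(registry_path, encoding="utf-8") as fh:
        entries = json.load(fh)
    # Assumption: each registry entry is a dict with a "source" key (file path).
    tracked = {os.path.abspath(e["source"]) for e in entries if "source" in e}

    now = time.time()
    removable = []
    for name in sorted(os.listdir(watch_dir)):
        path = os.path.abspath(os.path.join(watch_dir, name))
        if not name.endswith(".json") or not os.path.isfile(path):
            continue
        # Skip recent files: Filebeat may not have discovered them yet.
        if now - os.path.getmtime(path) < min_age_seconds:
            continue
        if path not in tracked:
            removable.append(path)
    return removable


def remove_files(paths):
    """Delete every path in the list."""
    for path in paths:
        os.remove(path)
```

I would run something like remove_files(find_removable(watch_dir, registry_path)) from the same job that writes the files, with both paths pointing at my own setup.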
I am thinking that I should set close_eof: true to close the file as soon as it is read.
At this point the file is still being monitored for changes, and Filebeat will keep monitoring it.
I, however, want to go ahead and remove this file from the registry and take it off the radar. The file will technically be "inactive" as soon as I get done reading it.
Q3: Should I set clean_inactive to a "small" value, like 30 seconds?
However, this value should be greater than ignore_older + scan_frequency, and ignore_older is supposed to be greater than close_inactive.
Q4: Does close_inactive really apply/matter if I am using close_eof? The file will be closed long before close_inactive would hit.
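To sanity-check candidate values against the two ordering rules mentioned above, I put together a small helper. It only encodes those rules as I understand them; the function name and the sample numbers are hypothetical, and all durations are plain seconds.

```python
def check_cleanup_settings(clean_inactive, ignore_older, scan_frequency, close_inactive):
    """Return a list of violated ordering rules; all values are in seconds."""
    problems = []
    # Rule 1: clean_inactive > ignore_older + scan_frequency
    if not clean_inactive > ignore_older + scan_frequency:
        problems.append("clean_inactive must be greater than ignore_older + scan_frequency")
    # Rule 2: ignore_older > close_inactive
    if not ignore_older > close_inactive:
        problems.append("ignore_older must be greater than close_inactive")
    return problems


# Hypothetical values: 30s clean_inactive with a 10s scan_frequency.
print(check_cleanup_settings(30, 15, 10, 5))    # -> []
print(len(check_cleanup_settings(30, 30, 10, 300)))  # -> 2
```

With clean_inactive at 30 seconds and scan_frequency at 10 seconds, ignore_older has to stay below 20 seconds and close_inactive below that, which is what makes me unsure whether such a small value is sensible.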
Hope my thoughts and questions haven't convoluted the simple problem of just wanting to remove files after processing.