I need to pull in logs that will be dumped on a shared Windows drive.
- The files are small, but there will be a lot of them.
- Once a file has been processed and deleted, it will never reappear.
- The file names contain a timestamp and are therefore unique.
The ELK stack runs on Linux, and I have been able to mount the Windows drive via CIFS.
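For reference, the mount is roughly along these lines (the server/share name and credentials file below are placeholders, not my real values):

sudo mount -t cifs //winserver/logs /Data \
    -o credentials=/root/.smbcreds,uid=logstash,gid=logstash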
In the past I tried to pull in logs from a shared drive using Filebeat running on a Windows machine, and I quickly got burnt (existing data was re-ingested).
So I am being extra cautious.
The plan is:
- Logstash monitors the shared drive mounted via CIFS.
- When a file is found, it is processed, and Logstash then deletes it from the shared folder so it cannot be ingested again.
This is the config I have now:
input
{
  file
  {
    path => "/Data/*.csv"
    start_position => "beginning"
    mode => "read"
    # Delete each file once it has been read completely
    # (this is also the default action in read mode).
    file_completed_action => "delete"
    sincedb_path => "/Data/logstash/fileTracker.txt"
    id => "my_file_input"
  }
}
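For completeness, the filter/output side will be something along these lines; the column names, host, and index are illustrative placeholders, not my real schema:

filter
{
  csv
  {
    separator => ","
    # Placeholder column names for illustration.
    columns => ["timestamp", "level", "message"]
  }
  date
  {
    match => ["timestamp", "ISO8601"]
  }
}
output
{
  elasticsearch
  {
    hosts => ["http://localhost:9200"]
    index => "shared-drive-logs-%{+YYYY.MM.dd}"
  }
}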
Please let me know of any potential issues you see with this setup.