I have installed an ELK stack in production to get stats from our CDN logs. I have about 6 months of past logs to parse, with between 50 and 200 million events per day.
The logs are stored as gzip files. Currently, I have a script that uncompresses them (40 per batch) into a directory watched by Filebeat.
But I have no way to know when the 40 files have been parsed so that I can start another batch, so I'm doing it by hand... Any idea how my script could know when Filebeat has finished?
I thought I could rely on the registry, but it contains no information indicating that a file has been read to the end.
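
To make the registry idea concrete: the registry has no explicit "finished" flag, but a possible workaround is to compare the byte offset recorded for each file with that file's size on disk. The sketch below is only an illustration under stated assumptions: it assumes the older single-file registry layout (a JSON array of entries with `source` and `offset` fields), which newer Filebeat versions no longer use, and `batch_is_done`, `registry_path` and `batch_dir` are hypothetical names, not part of Filebeat.

```python
import json
import os


def batch_is_done(registry_path, batch_dir):
    """Return True when every regular file in batch_dir has a recorded
    offset that reaches the end of the file.

    Assumption: the registry is a JSON array of objects that each carry
    a "source" path and an "offset" in bytes.
    """
    with open(registry_path, encoding="utf-8") as fh:
        entries = json.load(fh)

    # Map each harvested path to the offset recorded for it.
    offsets = {os.path.abspath(e["source"]): e["offset"] for e in entries}

    for name in os.listdir(batch_dir):
        path = os.path.abspath(os.path.join(batch_dir, name))
        if not os.path.isfile(path):
            continue
        # A file missing from the registry has not been picked up yet,
        # so it is treated as unfinished.
        if offsets.get(path, -1) < os.path.getsize(path):
            return False
    return True
```

The batch script could poll something like `batch_is_done("/path/to/registry", "/path/to/watched/dir")` (both paths are placeholders) every few seconds and only uncompress the next 40 files once it returns True. Note that the registry is written periodically, so it may lag slightly behind what has actually been read, and an offset at end of file says nothing about whether the events have been indexed downstream.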