I just finished reading the documentation for Logstash and am still digesting it, so perhaps I am asking this question too early. I'm not sure whether I can use Logstash the way I had hoped. We get logs from customers bundled in zip files, delivered to us in a directory named after a case number. The directory structure inside the zip is standardized. Currently we extract them into that folder and read them manually. The tree below is a simplified example of what it looks like:
case1234
|- appliance node a
|  |- smb log
|  |- json object
|  |- jvm log
case1235
|- appliance node a
|  |- smb log
|  |- json object
|  |- java log
|- appliance node b
|  |- smb log
|  |- json object
|  |- java log
etc...
I would instead like to extract these and then process them with Logstash, tagging them with the case number and node name and cleaning up their formatting. We could then build Elasticsearch queries to search for common issues, aggregate log files from multiple nodes/days into single views, analyze usage data, etc.
The issue I am having trouble wrapping my head around is which input plugin to use and how to get these logs in. It seems like the file input expects files to always be in the same place. Would it make sense to generate a new config and reload Logstash every time we want to scan a new directory?
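For context, what I'm picturing is something like the sketch below: a file input with a glob path and a dissect filter that pulls the case number and node out of the file path. This is untested; the paths, field names, and index name are placeholders I made up, and I'm not sure whether the source field is path or [log][file][path] in current versions:

# Rough sketch of what I have in mind (untested, placeholder paths/fields).
input {
  file {
    # Glob over all extracted case directories; new case folders would
    # presumably be picked up automatically.
    path => "/data/cases/*/*/*.log"
    mode => "read"                     # read whole files instead of tailing
    sincedb_path => "/var/lib/logstash/cases.sincedb"
  }
}

filter {
  # Pull the case number and node name out of the file path.
  # Depending on version/ECS settings the source field may be
  # [log][file][path] instead of path.
  dissect {
    mapping => {
      "path" => "/data/cases/%{case_number}/%{node}/%{filename}"
    }
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "cases-%{case_number}"
  }
}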
Some googling suggests I may be able to use a wildcard, but I don't want to rescan thousands of already-scanned log files every time we open a new case. It also seems like I could build my own logic and push pre-tagged logs to Redis to feed them into Logstash, but that feels like duplicating a lot of the functionality Logstash already has.
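If I did go the pre-tagging route, I assume the Logstash side would just be a redis input reading JSON events that already carry the case/node fields, something like this (host, key, and field names are made up):

# Events pushed to Redis already tagged (e.g. JSON with case_number and
# node fields set) would presumably just be consumed like this.
input {
  redis {
    host      => "127.0.0.1"
    data_type => "list"
    key       => "case_logs"
    codec     => json
  }
}

output {
  # Just to eyeball the pre-tagged events while testing.
  stdout { codec => rubydebug }
}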
Should I be building a custom importer for this? Have I missed something simple? Is this an intended use case?