Use Case Question - Zipped logs from appliance at customer site


#1

I just finished reading the documentation for Logstash and I am still processing it a bit, so perhaps I am asking this question too early. I'm not sure if I can use Logstash the way I had hoped. We get logs from customers bundled in zip files, delivered to us in a directory named after a case number. The directory structure inside the zip is standardized. Currently we extract them into that folder and then read them manually. The folder below is a simplified example of what it looks like:

case1234
  |-appliance node a
    |- smb log
    |- json object
    |- jvm log
case1235
  |-appliance node a
  | |- smb log
  | |- json object
  | |- java log
  |-appliance node b
    |- smb log
    |- json object
    |- java log
etc...

I would instead like to extract these and then process them with Logstash, tagging them with the case number and node number and cleaning up their formatting. We could then build Elasticsearch queries to search for common issues, aggregate log files from multiple nodes/days into single views, analyze usage data, etc.

The issue I am having trouble wrapping my head around is which input plugin to use and how to import these logs. It seems like the file input expects files to always be in the same place. Would it make sense to generate a new config and reload Logstash every time we want to scan a new directory?

Some googling shows I may be able to use a wildcard, but I don't want to rescan thousands of already scanned log files every time we open a new case. It also seems like I might be able to build my own logic and pass pre-tagged logs to Redis to feed them to Logstash, but that seems like duplicating a lot of the functionality that Logstash already has.

Should I be building a custom importer for this? Perhaps I missed something simple? Is this an intended use case?


(Mark Walkom) #2

Wildcards are an option, but as you said, they will rescan existing files rather than ignoring them (if they've been processed).

If you are unzipping these then it may make sense to cat them into LS as part of that process?


#3

So as the files are extracted I feed them into Logstash? That might work. For most of the files, at least, I could use a pipe to feed them right into LS, and as long as they match a filter I could make sure they are treated as the right type of file. I think I would lose information on the case number or which node it came from though.
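One way to keep the case and node information while piping is to wrap each line in a JSON object before it reaches Logstash. The following is only a minimal sketch under stated assumptions: the extracted layout is `<case>/<node>/<log file>` as in the example tree above, and the function name `tagged_events` and the field names `case`, `node`, `source_file` and `message` are made up for illustration, not anything Logstash requires.

```python
import json
import sys
from pathlib import Path


def tagged_events(case_dir):
    """Yield one dict per log line, tagged with case and node taken from the path.

    Assumes the layout <case_dir>/<node>/<log file> (hypothetical, based on
    the simplified tree in the first post).
    """
    case_dir = Path(case_dir).resolve()
    for log_file in sorted(case_dir.glob("*/*")):
        if not log_file.is_file():
            continue
        node = log_file.parent.name
        # errors="replace" keeps going if a log contains undecodable bytes.
        with log_file.open(errors="replace") as handle:
            for line in handle:
                yield {
                    "case": case_dir.name,
                    "node": node,
                    "source_file": log_file.name,
                    "message": line.rstrip("\n"),
                }


def write_events(case_dir, out=sys.stdout):
    """Write the tagged events as one JSON object per line."""
    for event in tagged_events(case_dir):
        out.write(json.dumps(event) + "\n")
```

Calling `write_events("case1235")` from a small script and piping its output into a Logstash process that uses the stdin input with a JSON-lines codec should then deliver events that already carry the case and node fields. This does move some logic outside Logstash, so it is a trade-off rather than a recommendation.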


(Magnus Bäck) #4

I think I would lose information on the case number or which node it came from though.

Indeed, if you use the file path to pick up that information. But instead of feeding Logstash the data via stdin you could specify a wildcard to the file input and have a separate short-lived Logstash process for slurping those files, e.g. like this:

/path/to/logstash -e 'input { file { path => ["/path/to/case/appliance_a/*.log"] type => "whatever" } }' -f /path/to/other/configfiles
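If a short-lived process is started per case, the input string passed to `-e` could be generated from the extracted directory so that each node gets its own file input with the case and node attached as fields. This is a rough sketch, not a tested setup: the helper name `file_input_config` is hypothetical, the `*.log` pattern and the `type` value are carried over from the command above, and `add_field` and `start_position` are file input options added here on the assumption that old files should be read from the start. Directory names containing double quotes are not escaped. Whether `-e` and `-f` can be combined may depend on the Logstash version.

```python
from pathlib import Path


def file_input_config(case_dir, log_type="whatever"):
    """Build an 'input { ... }' string with one file input per node directory.

    Assumes the layout <case_dir>/<node>/ (hypothetical, based on the
    simplified tree in the first post).
    """
    case_dir = Path(case_dir).resolve()  # the file input expects absolute paths
    blocks = []
    for node_dir in sorted(p for p in case_dir.iterdir() if p.is_dir()):
        blocks.append(
            "file { "
            f'path => ["{node_dir.as_posix()}/*.log"] '
            f'type => "{log_type}" '
            'start_position => "beginning" '
            f'add_field => {{ "case" => "{case_dir.name}" "node" => "{node_dir.name}" }} '
            "}"
        )
    return "input { " + " ".join(blocks) + " }"
```

The returned string would take the place of the hand-written `input { ... }` argument, so the filters in the other config files can rely on `case` and `node` being present without parsing the path.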
