Use Case Question - Zipped logs from appliance at customer site

fishpen0 · June 23, 2015, 5:46am

I just finished reading the documentation for logstash and I am still processing it a bit, so perhaps I am asking this question too early. I'm not sure if I can utilize logstash the way I had hoped. We get logs bundled in zip files from customers, delivered to us in a directory named after a case number. The directory structure of the zip is standardized. Currently we extract them into that folder and then read them manually. The folder below is a simplified example of what it looks like:

case1234
  |-appliance node a
    |- smb log
    |- json object
    |- java log
case1235
  |-appliance node a
  | |- smb log
  | |- json object
  | |- java log
  |-appliance node b
    |- smb log
    |- json object
    |- java log
etc...

I would instead like to extract these and then process them with logstash, tagging them with the case number and node number, and cleaning up their formatting. We could then build queries for elasticsearch to search for common issues, aggregate log files from multiple nodes/days into single views, analyze usage data, etc.

The issue I am having trouble wrapping my head around is which input plugin to use and how to import these logs. It seems like the file input expects files to always be in the same place. Would it make sense to generate a new config and reload logstash every time we want to scan a new directory?

Some googling shows I may be able to use a wildcard, but I don't want to rescan thousands of already scanned log files every time we open a new case. It also seems like I might be able to build my own logic and pass pre-tagged logs to redis to feed them to logstash, but that seems to duplicate a lot of the functionality that logstash already has.

Should I be building a custom importer for this? Perhaps I missed something simple? Is this an intended use case?

warkolm · June 23, 2015, 5:50am

Wildcards are an option, but like you said, they will rescan existing files rather than ignoring the ones that have already been processed.

If you are unzipping these then it may make sense to cat them into LS as part of that process?

fishpen0 · June 23, 2015, 6:02am

So as the files are extracted I feed them into logstash? That might work. At least for most of the files, I could use a pipe to feed them right into LS, and as long as they match a filter I could make sure they are treated as the right type of file. I think I would lose information on the case number or which node it came from though.
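
One way to keep the case and node when piping is to tag each line before it reaches Logstash. The following is only a minimal sketch, assuming the layout `<case dir>/<node dir>/<log file>` from the example tree above. The function name `tag_events` and the field names `case`, `node`, `source_file` and `message` are made up for illustration and are not anything Logstash requires.

```python
import json
import sys
from pathlib import Path


def tag_events(case_dir):
    """Yield one dict per log line, tagged with the case and node names.

    Assumes the layout <case_dir>/<node>/<log file>; deeper nesting is ignored.
    """
    case_dir = Path(case_dir)
    for log_file in sorted(case_dir.glob("*/*")):
        if not log_file.is_file():
            continue
        node = log_file.parent.name
        # errors="replace" keeps the walk going if a log has odd bytes in it.
        with log_file.open(errors="replace") as handle:
            for line in handle:
                yield {
                    "case": case_dir.name,
                    "node": node,
                    "source_file": log_file.name,
                    "message": line.rstrip("\n"),
                }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print one JSON object per line so the output can be piped onward.
    for event in tag_events(sys.argv[1]):
        print(json.dumps(event))
```

The output could then be piped into a Logstash process whose stdin input uses the json_lines codec, so the case and node arrive as ordinary event fields. Multi-line entries such as Java stack traces are still split per line here, so joining them would remain a job for Logstash.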

magnusbaeck · June 23, 2015, 6:16am

"I think I would lose information on the case number or which node it came from though."

Indeed, if you use the file path to pick up that information. But instead of feeding Logstash the data via stdin you could specify a wildcard to the file input and have a separate short-lived Logstash process for slurping those files, e.g. like this:

/path/to/logstash -e 'input { file { path => ["/path/to/case/appliance_a/*.log"] type => "whatever" } }' -f /path/to/other/configfiles
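
A rough sketch of how such a command could be generated per node directory when a new case is extracted, so that only the new case is read. The helper name `build_commands` is hypothetical, and the paths are the same placeholders as in the command above. Two things here are assumptions that go beyond the reply: `add_field` is used to attach the case and node names directly, and `start_position => "beginning"` is set because the file input otherwise starts at the end of files it has not seen before. Whether `-e` and `-f` can be combined depends on the Logstash version, so that part should be checked against your install.

```python
from pathlib import Path


def build_commands(case_dir, logstash="/path/to/logstash",
                   config_dir="/path/to/other/configfiles"):
    """Return one Logstash argv list per node directory under case_dir.

    Directory names are pasted into the config string unescaped, so this
    assumes they contain no double quotes.
    """
    case_dir = Path(case_dir)
    commands = []
    for node_dir in sorted(p for p in case_dir.iterdir() if p.is_dir()):
        # The file input expects absolute paths, so pass an absolute case_dir.
        pattern = (node_dir / "*.log").as_posix()
        input_config = (
            'input { file { '
            f'path => ["{pattern}"] '
            'start_position => "beginning" '
            f'add_field => {{ "case" => "{case_dir.name}" "node" => "{node_dir.name}" }} '
            '} }'
        )
        commands.append([logstash, "-e", input_config, "-f", config_dir])
    return commands
```

Each returned list can be handed to `subprocess.run`. Note that a file input keeps watching its files, so the process would still have to be stopped once the case has been read.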
