Reports - Old Data - Oh My

We host 16 Apache servers behind Varnish. I have users (Marketing) who are not technical. They're not happy with the current stats software we have installed, so I'm giving different webstats apps a try.

I've got ELK installed, and I like it well enough so far.

What I'd like to do is ..

  • Generate reports: Suzy from Marketing logs in and clicks a button 'Visits per hour' or 'GEO-IP of people from the UK'. Like that.

  • We have all logs from 2014 saved on disk. Can I get these inserted into ELK?

I am going to go search the documentation, but if anyone wants to chime in with pointers, tips, etc., please let me know.

You can use the file input (set start_position => beginning for this!) to read all the old files. The grok filter will help you parse the fields in each event. The date filter will help you tell logstash (and Elasticsearch) the correct time an event occurred.
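For the filter part, a minimal sketch might look like the following. It assumes the access logs are in the standard Apache combined format, which the thread does not confirm; if your LogFormat is customized, the grok pattern will need adjusting. The geoip filter is included only because GEO-IP reports were mentioned in the question.

```
filter {
  grok {
    # Assumption: standard Apache "combined" log format.
    # This pattern extracts fields such as clientip, timestamp, request and response.
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    # Use the timestamp from the log line as the event time,
    # so 2014 entries are indexed under 2014 and not under the import date.
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  geoip {
    # Optional: look up location data for the client address parsed by grok.
    source => "clientip"
  }
}
```

Without the date filter, every old event would get the time it was read, which would make a "visits per hour" view over last year's data meaningless.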

As for viewing "visits per hour" etc, you can use Kibana for this. Kibana's dashboarding system lets you build a view on your data, and you can share that view with others :smile:

Awesome, Jordan. That is a huge help.

I've got this

# fileimport.conf
input {
  file {
    path => "/var/log/test/app01/doman.com-access_log-20141201"
    type => "apache"
    start_position => "beginning"
  }
}

filter {
  # grok and date filters go here
}

output {
  elasticsearch {
    cluster => "elasticsearch.local"
    host => "127.0.0.1"
    protocol => "http"
    index => "muo-logs"
    index_type => "apache"
  }
}

I launched it like so from shell

$ /opt/logstash/bin/logstash -f fileimport.conf

However, since "files are followed in a manner similar to 'tail -0f'", it executes but never returns control to the shell. Or at least that is what I surmise: the index 'muo-logs' was built and populated with a few hundred MB of data. Hours later I terminated the process.

Is the correct method to end the command with '&' so it runs in the background? Or am I using the tool incorrectly?

No, you're on the right track. Logstash's file input will never consider a file "done" because it has no way of knowing when no more writes will take place. If you want Logstash to halt when a file has been processed you should feed the input via the stdin input instead.

Just to make sure I've got my ducks in a row pre-coffee ..

Change 'input' to this

input {
  stdin {
    type => "apache"
  }
}

And execute like so

/opt/logstash/bin/logstash -f fileimport.conf < /var/log/test/serverlogfile

Yeah, that looks good.
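Since there is a whole year of logs to load, the same stdin approach can be extended to many files by concatenating them into a single stream. The helper below is only a sketch: the name stream_logs is made up for illustration, the glob in the usage line is a guess at how the 2014 files are named, and the .gz branch assumes some of the old logs may have been compressed by log rotation.

```shell
# Hypothetical helper: write every file given as an argument to stdout,
# in argument order, decompressing any that end in .gz.
stream_logs() {
  for f in "$@"; do
    case "$f" in
      *.gz) gzip -dc "$f" ;;   # compressed rotated log
      *)    cat "$f" ;;        # plain-text log
    esac
  done
}

# Usage (paths are assumptions; adjust to your layout):
# stream_logs /var/log/test/app01/*-access_log-2014* | /opt/logstash/bin/logstash -f fileimport.conf
```

The shell expands the glob in sorted order, so date-suffixed file names are fed in chronologically, and Logstash should exit once the stream ends.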