How would you get full page hits out of Apache logs?

jerrac · October 24, 2015, 12:09am

Apache logs a message for every request. So you get a separate entry for every image/js/css file on every page. How would you filter that down to only hits on actual pages?

As in, a user comes to /about. Apache logs hits to /about, /js/blah.js, /css/style.css, and so on. How would you make sure you only get /about?

tbragin · October 24, 2015, 1:58pm

You would need something to actually make that correlation before you insert the data into Elasticsearch - Kibana can't do that with no additional information.

Since, as you note, Apache logs are stateless and only contain individual HTTP requests, tracking performance of the whole webpage usually requires adding additional tracking code to your website and basing your analytics on that. I'm familiar with boomerang developed by Yahoo, but there are probably others. You can also do it using your own code, as this blog describes.

jerrac · October 24, 2015, 7:47pm

Sticking tracking code in is the normal way to do it. But I was mainly looking for something like a pattern that helps exclude the most obviously non-page hits.

Basically, parsing Apache logs is how you would get stats if you couldn't insert tracking code, or couldn't put it on a page for some other reason. It also might be more accurate, since many people block JavaScript and/or Google Analytics.

Anyway, I was just curious to know if anyone had done something like this.

tbragin · October 24, 2015, 9:48pm

Gotcha, thanks for explaining further.

If you're looking to exclude files with certain extensions, you could do that at the filter level (see attached). My worry is that the list of extensions might get quite long, though, so I'm not sure how effective that'll be.
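
For anyone who wants to try the extension-based exclusion outside of Kibana, here is a minimal Python sketch. It assumes the logs are in Apache common/combined format. The extension list, the function names, and the sample lines are all hypothetical (the paths are borrowed from the question above); the list would need tuning per site.

```python
import re
from urllib.parse import urlsplit

# Hypothetical list of static-asset extensions; extend it for your own site.
STATIC_EXTENSIONS = {
    ".js", ".css", ".png", ".jpg", ".jpeg", ".gif",
    ".ico", ".svg", ".woff", ".woff2", ".map",
}

# Matches the quoted request field of a log line, e.g. "GET /about HTTP/1.1".
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+"')


def request_path(line):
    """Return the requested path without its query string, or None if unparseable."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    return urlsplit(match.group("target")).path


def is_page_hit(line):
    """True if the request does not look like a static asset, judged by extension."""
    path = request_path(line)
    if path is None:
        return False
    last_segment = path.rsplit("/", 1)[-1].lower()
    dot = last_segment.rfind(".")
    extension = last_segment[dot:] if dot != -1 else ""
    return extension not in STATIC_EXTENSIONS


# Made-up sample lines for illustration only.
SAMPLE_LINES = [
    '127.0.0.1 - - [24/Oct/2015:00:09:00 +0000] "GET /about HTTP/1.1" 200 5120',
    '127.0.0.1 - - [24/Oct/2015:00:09:01 +0000] "GET /js/blah.js HTTP/1.1" 200 1024',
    '127.0.0.1 - - [24/Oct/2015:00:09:01 +0000] "GET /css/style.css?v=2 HTTP/1.1" 200 2048',
]

page_hits = [request_path(line) for line in SAMPLE_LINES if is_page_hit(line)]
print(page_hits)
```

This only drops requests whose path ends in a listed extension, so it is a heuristic: AJAX calls, redirects, and bot traffic would still be counted as page hits unless they are filtered some other way.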
