Finding most frequently accessed URI-paths / Transactions using Logstash

Hi,

My log file looks like this:

2015-06-12:00:08:54 100.220.144.1 GET /site/path1.aspx 200
2015-06-12:00:08:55 100.220.144.1 GET /site/path2.aspx 200
2015-06-12:00:08:56 100.220.144.1 GET /site/path3.aspx 200
2015-06-12:00:08:56 100.220.144.1 GET /site/path4.aspx 200
2015-06-12:00:08:57 100.220.144.1 GET /site/path1.aspx 200
2015-06-12:00:08:57 100.220.144.1 GET /site/path5.aspx 200
2015-06-12:00:08:58 100.220.144.1 GET /site/path1.aspx 200
2015-06-12:00:08:59 100.220.144.1 GET /site/path2.aspx 200
2015-06-12:00:08:59 100.220.144.1 GET /site/path3.aspx 200
2015-06-12:00:08:59 100.220.144.1 GET /site/path5.aspx 200
2015-06-12:00:08:59 100.220.144.1 GET /site/path4.aspx 200

In this log, the most frequently accessed transaction (sequence of URL paths) is:
(/site/path1.aspx --> /site/path2.aspx --> /site/path3.aspx)

Is there any way to detect this using Logstash? Please suggest.

This is what you'd normally use Elasticsearch for. What's the point of making Logstash perform this ranking? Given the example input above, what event(s) would you expect Logstash to produce?

Hi @magnusbaeck. Thanks for the response.

I am expecting a new field named "userpath" to be created, holding the user's path: the sequence of URIs visited between an entry URI (say, Login.jsp) and an exit URI (say, Logout.jsp).

I tried the following code:

if [uri] == "Login.jsp" {
  aggregate {
    task_id => "%{clientip}"
    code => "map['userpath'] = event.get('uri'); map['userpath'] += ' --- '"
    map_action => "create"
  }
} else {
  if [uri] != "Login.jsp" and [uri] != "Logout.jsp" {
    aggregate {
      task_id => "%{clientip}"
      code => "map['userpath'] += event.get('uri'); map['userpath'] += ' --- '"
      map_action => "update"
    }
  }
  if [uri] == "Logout.jsp" {
    aggregate {
      task_id => "%{clientip}"
      code => "map['userpath'] += event.get('uri'); event.set('userpath', map['userpath'])"
      map_action => "update"
      end_of_task => true
      push_previous_map_as_event => true
    }
  }
}

I am getting a partially correct result.

However, I want to handle a few more scenarios to get a more accurate result, for example:
1. The user logs in, does not click Logout, and then returns to the login page. In this case the path should be (Login.jsp --- /somePage1.jsp --- /somePage2.jsp --- Login.jsp).

2. The user logs in and a timeout occurs (say, timeout = 15 minutes). In this case the userpath should run from Login.jsp through whatever the user clicked before the timeout.
etc.

Please suggest.

I think this sounds like a good use-case for an entity-centric index.
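
If you want to stay within Logstash, the two scenarios above could be approximated with the aggregate filter's own options. The following is only a rough, untested sketch: it assumes the events already carry `uri` and `clientip` fields as in the attempt above, the tag name `userpath_timeout` is made up for illustration, and the 900-second value is just the 15 minutes mentioned in the question.

```
filter {
  if [uri] == "Login.jsp" {
    aggregate {
      task_id => "%{clientip}"
      # No map_action here: the default (create_or_update) lets the code run
      # on a repeated login too. If a path is still open (scenario 1), it is
      # closed on this event and a fresh path is started.
      code => "
        if map['userpath']
          event.set('userpath', map['userpath'] + event.get('uri'))
        end
        map['userpath'] = event.get('uri') + ' --- '
      "
      # Scenario 2: emit whatever was collected once the task expires.
      timeout => 900
      push_map_as_event_on_timeout => true
      timeout_task_id_field => "clientip"
      timeout_tags => ["userpath_timeout"]   # hypothetical tag name
      # Drop the trailing separator on the timeout-generated event.
      timeout_code => "event.set('userpath', event.get('userpath').to_s.chomp(' --- '))"
    }
  } else if [uri] == "Logout.jsp" {
    aggregate {
      task_id => "%{clientip}"
      code => "event.set('userpath', map['userpath'] + event.get('uri'))"
      map_action => "update"
      end_of_task => true
    }
  } else {
    aggregate {
      task_id => "%{clientip}"
      code => "map['userpath'] += event.get('uri') + ' --- '"
      map_action => "update"
    }
  }
}
```

A few caveats. The aggregate filter only behaves predictably with a single pipeline worker (`-w 1`), because events for one task must be processed in order. `timeout` is counted from the first event of a task, so a repeated login in this sketch does not restart the clock; if the 15 minutes should mean inactivity rather than total session length, `inactivity_timeout` is the option to look at. The timeout options are defined in only one of the three aggregate blocks on purpose, since they apply per `task_id` pattern.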
