Please excuse my English.
We are successfully doing many things with Logstash (parsing around 25 fields with the csv filter, decoding URLs, and more).
But the thousands of web requests triggered through referers make the logs impossible to analyze.
For a single click by a user, the proxy generates many log lines in many categories.
Of course we tried the aggregate filter. It works when the referer chain is only one level deep and the user starts a new visit to the site.
For example:
- Url_request : example.com/
- Url_request : blablabla.com/12321.txt referer : example.com
We simply add a new field based on this condition: if there is no referer, the request itself is the main one; if a referer exists (here example.com), the referer is the main one. After that we run the aggregate filter.
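
To make the condition concrete, here is a rough sketch in plain Python (not our actual Logstash configuration). The field names `url_request`, `referer` and `main_request`, and the trailing-slash normalization, are assumptions made only for this illustration.

```python
def normalize(url):
    # Assumption: drop a trailing slash so "example.com/" and "example.com" match.
    return url.rstrip("/")


def add_main_field(event):
    """Tag an event with the request it belongs to (one level deep only)."""
    referer = event.get("referer")
    if referer:
        # A referer exists: the referer is treated as the main request.
        main = normalize(referer)
    else:
        # No referer: the request itself is the main request.
        main = normalize(event["url_request"])
    return {**event, "main_request": main}


events = [
    {"url_request": "example.com/"},
    {"url_request": "blablabla.com/12321.txt", "referer": "example.com"},
]
tagged = [add_main_field(e) for e in events]
for e in tagged:
    print(e["url_request"], "->", e["main_request"])
```

Both events end up with the same `main_request`, which is the value we then use as the key for the aggregate filter.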
But what do we do if the request isn't "new"?
For example:
The user comes back from lunch and his browser is still on example.com (of course all aggregation timeouts have expired by then). Then he clicks a link on example.com.
- Url_request : example.com/news1 referer : example.com
So, the second example isn't a big problem...
The big problem is:
How do we aggregate web requests that are more than one level deep?
For example:
1) Url_request : example.com/
2) Url_request : blablabla.com/12321.js referer : example.com
3) Url_request : zxc.com referer : blablabla.com/12321.js
A good example is YouTube...
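
To show the result we are after, here is a small sketch in plain Python, again only an illustration and not a Logstash configuration. It assumes that events arrive in order, that the hypothetical fields `url_request` and `referer` hold the values shown above, and that keeping a lookup table in memory is acceptable. A real setup would probably also need one table per user and some expiry, which this sketch leaves out.

```python
def normalize(url):
    # Assumption: drop a trailing slash so "example.com/" and "example.com" match.
    return url.rstrip("/")


def assign_roots(events):
    """Tag every event with the first request of its referer chain."""
    roots = {}  # normalized URL -> root URL of the chain it belongs to
    result = []
    for event in events:
        url = normalize(event["url_request"])
        referer = event.get("referer")
        if referer:
            ref = normalize(referer)
            # Reuse the root already known for the referer; if the referer
            # was never seen (or was forgotten), treat it as the root itself.
            root = roots.get(ref, ref)
        else:
            root = url
        roots[url] = root
        result.append({**event, "main_request": root})
    return result


events = [
    {"url_request": "example.com/"},
    {"url_request": "blablabla.com/12321.js", "referer": "example.com"},
    {"url_request": "zxc.com", "referer": "blablabla.com/12321.js"},
]
resolved = assign_roots(events)
for e in resolved:
    print(e["url_request"], "->", e["main_request"])
```

All three requests should be grouped under example.com. What we don't know is how to get the same behaviour inside Logstash.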
Thanks for any advice.