We are indexing our web proxy logs and have a large number of usable fields: timestamp, URL visited, bytes transferred, etc.
What I'd like to do is create a visualisation that aggregates the URLs by the sum of bytes transferred.
I can get a count of hits per URL, e.g. 50,000 hits to facebook.com and 5,000 hits to youtube.com, but that doesn't show that those 5,000 hits to YouTube transferred 3x the bandwidth.
So what I'd like is to add up all of the bytes transferred for each URL and show the top N results by that total.
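To make the difference between the two rankings concrete, here is a minimal sketch in plain Python. The sample records and the field names `url` and `bytes` are made up for illustration and are not taken from the real proxy logs.

```python
from collections import Counter, defaultdict

# Hypothetical sample records; field names "url" and "bytes" are assumptions.
records = [
    {"url": "facebook.com", "bytes": 1_000},
    {"url": "facebook.com", "bytes": 2_000},
    {"url": "facebook.com", "bytes": 1_500},
    {"url": "youtube.com", "bytes": 9_000},
    {"url": "youtube.com", "bytes": 6_000},
]


def top_n_by_hits(rows, n):
    """Rank URLs by how many log lines mention them."""
    return Counter(r["url"] for r in rows).most_common(n)


def top_n_by_bytes(rows, n):
    """Rank URLs by the total bytes transferred across all their hits."""
    totals = defaultdict(int)
    for r in rows:
        totals[r["url"]] += r["bytes"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


print(top_n_by_hits(records, 2))   # facebook.com ranks first by hit count
print(top_n_by_bytes(records, 2))  # youtube.com ranks first by bytes
```

The second function is the ranking I'm after: fewer hits, but more bandwidth.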
After playing around with the Visualization interface I was able to achieve exactly what I needed. I'll document it below in case anyone else is trying to achieve the same result:
Create a new Visualization (type: TSVB)
Top N
Aggregation = Sum (on field bytes)
Sub Aggregation = Cumulative Sum on the same field (bytes), which shows up in the list as "Sum of bytes"
Group by = Terms, on the URL field
This groups the results by URL and shows the cumulative sum of bytes for each one.
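For anyone who wants the same ranking outside the TSVB editor, the sketch below builds a roughly equivalent Elasticsearch search body: a terms aggregation ordered by a sum sub-aggregation. It is not what TSVB sends internally. The field names `url.keyword` and `bytes`, and the aggregation names `top_urls` and `total_bytes`, are assumptions, so adjust them to your own mapping.

```python
import json


def top_urls_by_bytes_query(url_field="url.keyword", bytes_field="bytes", size=10):
    """Build a search body that ranks URLs by total bytes transferred.

    The default field names are assumptions, not taken from a real index.
    """
    return {
        "size": 0,  # return only the aggregation, no individual hits
        "aggs": {
            "top_urls": {
                "terms": {
                    "field": url_field,
                    "size": size,
                    # order buckets by the sum sub-aggregation, largest first
                    "order": {"total_bytes": "desc"},
                },
                "aggs": {
                    "total_bytes": {"sum": {"field": bytes_field}},
                },
            }
        },
    }


print(json.dumps(top_urls_by_bytes_query(), indent=2))
```

The printed JSON can be pasted into Dev Tools as the body of a `_search` request against the index holding the proxy logs.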