Average of top n terms


(Roel) #1

Hello,

In a certain index documents have a keyword, a rank and a timestamp. The rank for a keyword may differ from time to time. This means the dataset may look like this:

{"keywords": "piano", "rank": 1, "timestamp": 1437642812}
{"keywords": "piano", "rank": 2, "timestamp": 1437642813}
{"keywords": "electric guitar", "rank": 5, "timestamp": 1437644326}

I would like to get the average rank of the 500 most frequently occurring keywords, but I cannot find out how to do this. My attempts so far always seem to give the average for the entire dataset.

Roel


(Colin Goodheart-Smithe) #2

In the current version this is not possible, but with pipeline aggregations coming in version 2.0 you will be able to use the avg_bucket aggregation to do this: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-pipeline-avg-bucket-aggregation.html
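As a rough sketch of what such a 2.0 request body might look like, written here as a Python dict: the field names ("keywords", "rank") are taken from the sample documents in the question, while the aggregation names ("top_keywords", "avg_rank", "avg_top_rank") are made up for illustration. It assumes the "keywords" field is indexed so that each keyword is a single term.

```python
import json

# Field names ("keywords", "rank") come from the sample documents in the question.
# Aggregation names ("top_keywords", "avg_rank", "avg_top_rank") are hypothetical.
request_body = {
    "size": 0,  # only the aggregation results are needed, not the hits
    "aggs": {
        # One bucket per keyword, limited to the 500 most frequent keywords.
        "top_keywords": {
            "terms": {"field": "keywords", "size": 500},
            # Average rank within each keyword bucket.
            "aggs": {"avg_rank": {"avg": {"field": "rank"}}},
        },
        # Sibling pipeline aggregation: mean of the per-keyword averages.
        "avg_top_rank": {
            "avg_bucket": {"buckets_path": "top_keywords>avg_rank"}
        },
    },
}

print(json.dumps(request_body, indent=2))
```

Note that this gives the mean of the per-keyword averages, so every keyword counts equally regardless of how many documents it has.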

In the meantime, you would need to run an aggregation for the top 500 terms and perform the average calculation on the client side.
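A minimal sketch of that client-side step, assuming the search response comes from a terms aggregation with an avg sub-aggregation on "rank". The aggregation names "top_keywords" and "avg_rank" and the helper function are hypothetical, and the sample response is built by hand from the three documents in the question rather than taken from a real cluster.

```python
def average_of_bucket_averages(response, terms_agg="top_keywords", avg_agg="avg_rank"):
    """Return the mean of the per-keyword average ranks, or None if there are none.

    `terms_agg` and `avg_agg` are the (hypothetical) aggregation names
    used in the request that produced `response`.
    """
    buckets = response["aggregations"][terms_agg]["buckets"]
    # Skip buckets whose average is missing (no documents with a rank value).
    values = [b[avg_agg]["value"] for b in buckets if b[avg_agg]["value"] is not None]
    if not values:
        return None
    return sum(values) / len(values)


# Hand-built response for the three sample documents in the question.
sample_response = {
    "aggregations": {
        "top_keywords": {
            "buckets": [
                {"key": "piano", "doc_count": 2, "avg_rank": {"value": 1.5}},
                {"key": "electric guitar", "doc_count": 1, "avg_rank": {"value": 5.0}},
            ]
        }
    }
}

print(average_of_bucket_averages(sample_response))  # 3.25
```

If each document should count equally instead of each keyword, the per-bucket averages could be weighted by `doc_count` before dividing.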


(Roel) #3

Thank you for your answer.
I imagine this would work for a normal script, but is this also possible when I want to use the data for Kibana?

Roel


(Colin Goodheart-Smithe) #4

Yes, this would work in 2.0 for requests sent straight to Elasticsearch. However, it will take some time for the functionality to be added to the Kibana interface, although the Kibana team is thinking about how to add it.

