Top k of top n: best way to do a nested aggregation


(geantbrun) #1

Hi,
I have millions of documents that contain (among others) two fields,
property1 and property2. Each field can hold many values per document.
I want to list the top n values of property1 and, for each of those
values, find the top k values of property2. For example, the top 2 values
of property2 for each of the top 3 values of property1 could be:

prop1: YELLOW
  prop2: heavyweight
  prop2: useful

prop1: RED
  prop2: unuseful
  prop2: lightweight

prop1: BLACK
  prop2: handy
  prop2: lightweight
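
To make the intended result concrete, here is a minimal in-memory sketch in Python of a "top k of top n" count. It is not Elasticsearch code: the function name top_k_of_top_n, the SAMPLE_DOCS data, and the tie-breaking rule (alphabetical on equal counts) are assumptions made for illustration. Each value is counted once per document that contains it, and the counts are exact because everything runs in one process.

```python
from collections import Counter, defaultdict

# Hypothetical sample documents; both fields are multi-valued.
SAMPLE_DOCS = [
    {"property1": ["YELLOW"], "property2": ["heavyweight", "useful"]},
    {"property1": ["YELLOW", "RED"], "property2": ["heavyweight"]},
    {"property1": ["YELLOW"], "property2": ["heavyweight", "useful", "handy"]},
    {"property1": ["RED"], "property2": ["lightweight"]},
    {"property1": ["BLACK"], "property2": ["handy"]},
]


def _top(counter, size):
    """Return the `size` most frequent (value, count) pairs.

    Ties are broken alphabetically (an assumption for determinism).
    """
    return sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))[:size]


def top_k_of_top_n(docs, n, k):
    """For the n most frequent property1 values, list the k most
    frequent property2 values among the documents that carry them."""
    outer = Counter()             # property1 value -> document count
    inner = defaultdict(Counter)  # property1 value -> property2 counts
    for doc in docs:
        # Use sets so a value repeated inside one document counts once.
        p1_values = set(doc.get("property1", []))
        p2_values = set(doc.get("property2", []))
        for p1 in p1_values:
            outer[p1] += 1
            inner[p1].update(p2_values)
    return [(p1, count, _top(inner[p1], k)) for p1, count in _top(outer, n)]


if __name__ == "__main__":
    for p1, count, top_p2 in top_k_of_top_n(SAMPLE_DOCS, n=3, k=2):
        print(p1, count, top_p2)
```

On a sharded index the counts returned by a terms aggregation can be approximate, so a real result may differ slightly from this exact computation.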

My first try with aggregations is:

{
  "size": 0,
  "aggs": {
    "topn": {
      "terms": { "field": "property1", "size": 3 },
      "aggs": {
        "topk": {
          "terms": { "field": "property2", "size": 2 }
        }
      }
    }
  }
}

First: is it correct? Second: is it the best way to do it?
My guesses are yes and no, respectively.
Any help is appreciated.

Regards,
Patrick



(system) #2