Hi Mark, I'll try to explain a limit case:
POST test/doc/1
{
"people":["john", "joe", "rob"],
"company":["Just Js"]
}
POST test/doc/2
{
"people":["john", "joe", "rob"],
"company":["Only Os"]
}
POST test/doc/3
{
"people":["bob"],
"company":["Apple", "IBM", "Microsoft"]
}
POST test/doc/4
{
"people":["diego", "alice"],
"company":["Apple", "IBM", "Microsoft"]
}
POST test/doc/5
{
"people":["jose"],
"company":["Apple", "IBM", "Microsoft"]
}
With that document distribution we have:
Top People:
joe (2)
john (2)
rob (2)
alice (1)
bob (1)
diego (1)
jose (1)
Top Companies:
Apple (3)
IBM (3)
Microsoft (3)
Just Js (1)
Only Os (1)
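As a sanity check, the two rankings above can be reproduced outside Elasticsearch with a few lines of plain Python. This is only a sketch: `top_terms` is a hypothetical helper (not an Elasticsearch API), and it assumes the default terms ordering of document count descending with ties broken by key ascending.

```python
from collections import Counter

# The five documents posted above (ids 1 to 5).
docs = [
    {"people": ["john", "joe", "rob"], "company": ["Just Js"]},
    {"people": ["john", "joe", "rob"], "company": ["Only Os"]},
    {"people": ["bob"], "company": ["Apple", "IBM", "Microsoft"]},
    {"people": ["diego", "alice"], "company": ["Apple", "IBM", "Microsoft"]},
    {"people": ["jose"], "company": ["Apple", "IBM", "Microsoft"]},
]


def top_terms(documents, field):
    """Count the documents containing each term of `field`.

    Hypothetical helper: sorts by count descending, then by term ascending.
    """
    counts = Counter(term for doc in documents for term in set(doc[field]))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


top_people = top_terms(docs, "people")
top_companies = top_terms(docs, "company")

for name, count in top_people:
    print(f"{name} ({count})")
for name, count in top_companies:
    print(f"{name} ({count})")
```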
So if the first aggregation uses people, like this:
POST test/_search
{
"size": 0,
"aggs": {
"peoples": {
"terms": {
"field": "people.keyword",
"size" : "3"
},
"aggs": {
"employees": {
"terms": {
"field": "company.keyword",
"size" : "3"
}
}
}
}
}
}
Note that the size parameter is set to 3 (because we have a very small set of documents, only 5). The response is:
"aggregations": {
"peoples": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 4,
"buckets": [
{
"key": "joe",
"doc_count": 2,
"employees": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Just Js",
"doc_count": 1
},
{
"key": "Only Os",
"doc_count": 1
}
]
}
},
{
"key": "john",
"doc_count": 2,
"employees": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Just Js",
"doc_count": 1
},
{
"key": "Only Os",
"doc_count": 1
}
]
}
},
{
"key": "rob",
"doc_count": 2,
"employees": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Just Js",
"doc_count": 1
},
{
"key": "Only Os",
"doc_count": 1
}
]
}
}
]
}
}
Then, if the first aggregation uses companies, like this:
POST /test/_search
{
"size": 0,
"aggs": {
"companies": {
"terms": {
"field": "company.keyword",
"size" : "3"
},
"aggs": {
"employees": {
"terms": {
"field": "people.keyword",
"size" : "3"
}
}
}
}
}
}
the response is:
"aggregations": {
"companies": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 2,
"buckets": [
{
"key": "Apple",
"doc_count": 3,
"employees": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 1,
"buckets": [
{
"key": "alice",
"doc_count": 1
},
{
"key": "bob",
"doc_count": 1
},
{
"key": "diego",
"doc_count": 1
}
]
}
},
{
"key": "IBM",
"doc_count": 3,
"employees": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 1,
"buckets": [
{
"key": "alice",
"doc_count": 1
},
{
"key": "bob",
"doc_count": 1
},
{
"key": "diego",
"doc_count": 1
}
]
}
},
{
"key": "Microsoft",
"doc_count": 3,
"employees": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 1,
"buckets": [
{
"key": "alice",
"doc_count": 1
},
{
"key": "bob",
"doc_count": 1
},
{
"key": "diego",
"doc_count": 1
}
]
}
}
]
}
}
So in this example we can note:
- if I aggregate on "people" with a sub-aggregation on "company", I lose the top companies (Apple, IBM, Microsoft never appear); in the same way,
- if I aggregate on "company" with a sub-aggregation on "people", I lose the top people (joe, john, rob never appear).
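The effect can also be shown with a small in-memory simulation of a terms aggregation with a terms sub-aggregation. This is a rough sketch under stated assumptions, not how Elasticsearch is implemented: `terms_agg`, `nested_terms` and `lost_terms` are hypothetical helpers, everything runs on a single in-memory list (no shards), and ties are broken by key ascending.

```python
from collections import Counter

# The five documents posted above (ids 1 to 5).
docs = [
    {"people": ["john", "joe", "rob"], "company": ["Just Js"]},
    {"people": ["john", "joe", "rob"], "company": ["Only Os"]},
    {"people": ["bob"], "company": ["Apple", "IBM", "Microsoft"]},
    {"people": ["diego", "alice"], "company": ["Apple", "IBM", "Microsoft"]},
    {"people": ["jose"], "company": ["Apple", "IBM", "Microsoft"]},
]


def terms_agg(documents, field, size):
    """Top `size` terms of `field` by document count (ties: key ascending)."""
    counts = Counter(term for doc in documents for term in set(doc[field]))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [key for key, _ in ranked[:size]]


def nested_terms(documents, outer, inner, size):
    """For each top `outer` term, the top `inner` terms among its documents."""
    result = {}
    for key in terms_agg(documents, outer, size):
        matching = [doc for doc in documents if key in doc[outer]]
        result[key] = terms_agg(matching, inner, size)
    return result


def lost_terms(documents, outer, inner, size):
    """Global top `inner` terms that appear in no sub-bucket at all."""
    global_top = set(terms_agg(documents, inner, size))
    nested = nested_terms(documents, outer, inner, size)
    seen = {key for keys in nested.values() for key in keys}
    return sorted(global_top - seen)


people_first = nested_terms(docs, "people", "company", 3)
company_first = nested_terms(docs, "company", "people", 3)
lost_companies = lost_terms(docs, "people", "company", 3)
lost_people = lost_terms(docs, "company", "people", 3)

print(lost_companies)  # ['Apple', 'IBM', 'Microsoft']
print(lost_people)     # ['joe', 'john', 'rob']
```

With size 3 in both directions, the sub-buckets only ever contain terms that co-occur with the top outer terms, so the global top terms of the inner field are missed entirely in this data set.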
Diego