Hi community,
I have the following index, which contains media and their related products (simplified):
{
  "media": {
    "mappings": {
      "doc": {
        "properties": {
          "color": {
            "type": "keyword"
          },
          "product": {
            "type": "nested",
            "properties": {
              "identifier": {
                "type": "keyword"
              },
              "color": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
The color property can be set on the media itself, on the product, or on both. I'd like to get a bucket aggregation for the color property, no matter whether it is set on the media or on the product.
So what I did was basically run two aggregations (one for each field) and then sum the counts for each term across the two fields (afterwards, with a script):
{
  "aggs": {
    "media_color_bucket": {
      "terms": {
        "field": "color"
      }
    },
    "product_color_bucket": {
      "nested": {
        "path": "product"
      },
      "aggs": {
        "color": {
          "terms": {
            "field": "product.color"
          }
        }
      }
    }
  }
}
Now if there is, for example, a media document with color = "red" and an assigned product that also has color = "red", summing the two fields gives me a doc count of 2, although both matches are found on the same "root" document.
So what I'm looking for is something like an aggregation where I can combine the terms of these two fields into one, and where a term that occurs multiple times on the same "root" document is counted only once.
Or in other words: for each color term, I want the count of media documents that either have the color themselves or have a related product with that color.
Is there a way to achieve this?
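To make the counting I have in mind concrete, here is a small Python sketch of the semantics. It is not an Elasticsearch query, only an illustration: the sample documents and the function names `naive_counts` and `root_counts` are made up for this example.

```python
from collections import Counter

# Hypothetical sample data mirroring the mapping above: a media document has
# an optional "color" and a list of nested "product" objects.
media_docs = [
    {"color": "red", "product": [{"identifier": "p1", "color": "red"}]},
    {"color": "blue", "product": [{"identifier": "p2", "color": "red"}]},
    {"product": [{"identifier": "p3", "color": "green"},
                 {"identifier": "p4", "color": "green"}]},
]


def naive_counts(docs):
    """Sum of the two separate terms aggregations (what I do today)."""
    counts = Counter()
    for doc in docs:
        if "color" in doc:
            counts[doc["color"]] += 1
        for product in doc.get("product", []):
            if "color" in product:
                counts[product["color"]] += 1
    return counts


def root_counts(docs):
    """Desired result: each color counted at most once per media document."""
    counts = Counter()
    for doc in docs:
        colors = set()
        if "color" in doc:
            colors.add(doc["color"])
        for product in doc.get("product", []):
            if "color" in product:
                colors.add(product["color"])
        counts.update(colors)  # one increment per distinct color per document
    return counts


naive = naive_counts(media_docs)
wanted = root_counts(media_docs)
print(dict(naive))
print(dict(wanted))
```

With this sample data the summed approach reports "red" three times even though only two media documents are involved, while the deduplicated count per root document is what I would like the aggregation to return.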
Regards
Tobi