Hello,
I am using parent-child relationship for product and it's availability/variants in stores. One product can have have 1000s of stores as child, and there are about 80k products. We have 3 shards and 3 powerful nodes.
In main query, I have a filter to fetch all products which are available in specific store as shown below.
"filter": { "has_child": { "inner_hits": { "size": 12 }, "query": { "term": { "store_id": "m2s3" } }, "type": "stores" } }
Now I want to do aggregation on color and size. Those fields are available within child documents. Please see query below which does the child level aggregation only for the specific store.
{ "aggregations": { "color_agg": { "children": { "type": "stores" }, "aggregations": { "color_agg_sub": { "filter": { "terms": { "store_id": [ "m2s3" ] } }, "aggregations": { "color_agg_sub_sub": { "terms": { "field": "color.raw", "size": 25 } } } } } } } }
This works, but the performance is very bad - it takes about 1.4 second to execute. If I remove the child level aggregation, it just returns response in less than 60ms. Looks like it's applying child level aggregation first across all stores and then it's filtering out only those which are within specific store. Is there any way to do filtering first to filter out child level documents and then do aggregation? That may help make it faster. Or if there are any other way to improve the performance, let me know.
Thanks!