I have an issue managing aggregations in a product-based search engine and was hoping someone could give me some pointers.
My products have different categories of tags (brand, packaging, size...) and can be filtered by checking one or more tags in each category. Filtering the products is easy enough; my problem is updating the tag choices so that only the ones still relevant to the current search are shown.
I hope it will be clear enough with an example:
If I select a specific brand, I need to remove the packaging tags that don't match any product with the selected brand. Again, simple enough here. The tricky part is that once I have selected a brand, I still want to show the other brands so that the user can check more of them and see more products.
To be clear, let's say I have brands A, B and C and packagings D, E and F. If I select brand A, and packagings D and E don't have any product with brand A, I need to stop showing them. But I still want to be able to select brands B and C to widen the search.
Is this something I can handle in a single Elasticsearch API call, or should it be handled in my client application by keeping these filters in memory somehow?
OK, thanks, that's helpful. So for each tag category, I pass the filters from every other tag category and exclude the current one.
I'm not sure what you mean by "post-filter" though. Do you mean applying a second filter to select only the products matching the selected brand(s)? That would actually be a good idea, because I am currently doing separate requests for the product display and the category/tag aggregations.
Sure! Happy to help. It is a well-known use case, but if you are unaware of the post-filter it is difficult to find. I was tied up the rest of the afternoon, so I'm glad you figured it out.
As for the filter on an aggregation, the structure is like this:
{
  "query": {
    // your query
  },
  "aggs": {
    "your_aggregation_name": {
      "filter": {
        // either a terms clause or a bool if you need to combine clauses
      },
      "aggs": {
        "a_name_again": {
          "terms": {
            "field": "your_field"
          }
        }
      }
    },
    "second_aggregation": {
      // ...
    }
  },
  "post_filter": {
    // post-filter as described (note the underscore in the key name)
  }
}
So within an aggregation, instead of launching straight into the terms clause, you have a filter clause and another aggs clause. The inner aggs clause contains your terms clause.
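To tie this back to the brand/packaging example, here is a rough Python sketch of how the request body could be assembled. It is only one way to do it, under some assumptions: the function name build_search_body, the inner aggregation name "values" and the field names "brand" and "packaging" are made up for illustration, and the fields are assumed to be mapped so that terms filters and terms aggregations work on them (e.g. keyword fields). Each category's aggregation is filtered by the selections of every other category, and the full set of selections only goes into post_filter so it narrows the hits without affecting the aggregations.

```python
import json


def build_search_body(query, selections, fields):
    """Build a faceted-search request body (illustrative helper, not an official API).

    query      -- the main query clause, as a dict
    selections -- {category: [selected values]}, e.g. {"brand": ["A"]}
    fields     -- {category: field to aggregate on} (hypothetical field names)
    """

    def terms_clauses(skip=None):
        # One terms clause per category that has a selection,
        # optionally leaving out the category named in `skip`.
        return [
            {"terms": {fields[cat]: values}}
            for cat, values in selections.items()
            if values and cat != skip
        ]

    aggs = {}
    for cat, field in fields.items():
        others = terms_clauses(skip=cat)
        aggs[cat] = {
            # Filter by every OTHER category's selection, never by its own,
            # so the unselected values of this category stay visible.
            "filter": {"bool": {"filter": others}} if others else {"match_all": {}},
            "aggs": {"values": {"terms": {"field": field}}},
        }

    body = {"query": query, "aggs": aggs}
    selected = terms_clauses()
    if selected:
        # Applied to the hits only, after the aggregations are computed.
        body["post_filter"] = {"bool": {"filter": selected}}
    return body


# Brand A is checked, nothing is checked under packaging.
body = build_search_body(
    query={"match_all": {}},
    selections={"brand": ["A"]},
    fields={"brand": "brand", "packaging": "packaging"},
)
print(json.dumps(body, indent=2))
```

With brand A checked, the "packaging" aggregation is restricted to brand A products (so packagings with no brand A product drop out), while the "brand" aggregation is left unfiltered, so B and C can still be offered. The returned hits are limited to brand A by post_filter.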