What is the best way to make products more relevant outside of the default
scoring?
I have an unknown number of business rules that will dictate a document's
relevance. That is, even if one document scores higher than another, the
lower-scoring document may be more relevant to the user.
Given two products with similar titles but different attributes and the
query "ipad", I'd like to promote one over the other:
{
  "title_simple": "iPad Mini Case",
  "description_simple": "Royce Leather iPad Mini Case:...",
  "category": "Computers & Accessories",
  "brand": "Royce Leather",
  "id": 794809052574
}
{
  "title_simple": "Apple iPad mini (16GB, Wi-Fi + Sprint 4G, White)",
  "description_simple": "iPad mini features a beautiful 7.9\" display...",
  "category": "Electronics",
  "brand": "Apple",
  "id": 885909689712
}
A simple query scores the iPad case high:
{
"query": { "term": { "title_simple": "ipad" }}
}
But business rules dictate that the actual iPad should rank at the top.
I can apply a filter or score based on the category or brand to get what I'm
looking for:
{
  "query": {
    "function_score": {
      "query": { "term": { "title_simple": "ipad" } },
      "functions": [{
        "filter": { "term": { "category_simple": "electronics" } },
        "boost_factor": 2
      }]
    }
  }
}
But building a bunch of these isn't scalable or reasonable.
I have an unknown number of these and that number will continue to grow.
Some other examples:
- query "xbox" should promote consoles over games
- query "macbook" should promote Apple computers over macbook sleeves
- query "Apple" should promote Apple products and not food
Building a thousand queries based on function filters is unreasonable and
unscalable.
Some possible solutions I've considered:
- Building a lookup table that will build the filter portion of the query
(this could get unmaintainable)
- Including a pre-calculated score in the document (unfortunately, this
doesn't work on a per-query basis, as the score may change based on the
user's needs)
- Extending the DefaultSimilarity class (I'm not sure how this helps me in
this scenario, though)
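To make the lookup-table option concrete, here is a minimal sketch of what I have in mind. It is only an illustration: the RULES table, the build_query helper, the "consoles" category value, and the boost numbers are all made up for this example, and it assumes the category_simple field and boost_factor syntax from the query above.

```python
import json

# Hypothetical rule table: query keyword -> list of (field, value, boost).
# The entries and boost values are illustrative assumptions, not real rules.
RULES = {
    "ipad": [("category_simple", "electronics", 2)],
    "xbox": [("category_simple", "consoles", 2)],
}


def build_query(keyword, field="title_simple"):
    """Wrap a term query in function_score, adding one filter per matching rule."""
    term = keyword.lower()
    base = {"term": {field: term}}
    functions = [
        {"filter": {"term": {rule_field: value}}, "boost_factor": boost}
        for rule_field, value, boost in RULES.get(term, [])
    ]
    if not functions:
        # No business rule for this keyword: fall back to the plain term query.
        return {"query": base}
    return {"query": {"function_score": {"query": base, "functions": functions}}}


if __name__ == "__main__":
    print(json.dumps(build_query("ipad"), indent=2))
```

This keeps the rules as data rather than as hand-written queries, but the table still has to be maintained by someone, which is the part I'm worried about.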
What have other people done to solve these problems? Is there something
else that I'm missing that could help?
Here's a runnable gist -
https://gist.github.com/dlmitchell/826e8fb7ca89bed30e4a/raw/613be2c202b26faaaa5899bdcfeac714737beb49/sample_mapping.sh