I have a product catalog stored as an Elasticsearch (8.16.1) index.
Each doc represents a product with different fields, such as name, color, etc.
I've added a vector as an additional field to each product / doc.
My search input can be free text, such as "red dress".
I transform this free text into a vector and find the closest products with a knn query, which works great and returns relevant products with 0 < _score < 1.0.
I am also querying with multi_match, which works great too but produces a _score with an unknown range (the values I see fall somewhere in 0 < _score < 25.0, but I can't tell what the maximum is).
I want to combine the two approaches with equal weight between knn and multi_match (or with some kind of boost value that lets me control the balance).
Since knn has a much lower _score range (at most 1.0), the impact of its results is negligible and multi_match dominates the ranking.
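To make the imbalance concrete, here is a minimal Python sketch with made-up scores (not real output from my index; the product IDs are hypothetical). It relies only on the fact that a bool query adds up the scores of its matching should clauses.

```python
# Made-up scores for illustration only; not real output from the index.
knn_scores = {"p1": 0.92, "p2": 0.35, "p3": 0.80}   # knn _score, below 1.0
text_scores = {"p1": 3.1, "p2": 21.4, "p3": 7.6}    # multi_match _score, unbounded


def combined(doc_id):
    # A bool/should query sums the scores of the clauses that match.
    return knn_scores.get(doc_id, 0.0) + text_scores.get(doc_id, 0.0)


knn_only = sorted(knn_scores, key=knn_scores.get, reverse=True)
text_only = sorted(text_scores, key=text_scores.get, reverse=True)
ranking = sorted(knn_scores, key=combined, reverse=True)

print("knn only:   ", knn_only)
print("text only:  ", text_only)
print("bool/should:", ranking)
```

With these numbers the combined ranking is identical to the multi_match ranking, even though the knn ranking is the exact opposite.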
Here is my query. How can I change or improve it so that knn and multi_match contribute equally?
{
  "bool": {
    "should": [
      {
        "knn": {
          "field": "embedding_vector",
          "query_vector": [1, 2, 3],
          "num_candidates": 10000,
          "filter": [
            {
              "term": {
                "categories.keyword": "Dresses"
              }
            }
          ]
        }
      },
      {
        "multi_match": {
          "query": "red dress",
          "fields": [
            "categories",
            "name",
            "color"
          ],
          "fuzziness": "AUTO",
          "type": "best_fields"
        }
      }
    ]
  }
}
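By a "control boost value" I mean something like the sketch below, written in Python only to build the request body. The boost parameter exists on both query types; the helper name build_hybrid_query and the weights 25.0 and 1.0 are placeholders I made up. Since the multi_match score has no fixed upper bound, I assume a constant like this cannot guarantee an equal contribution, which is why I'm asking.

```python
import json


def build_hybrid_query(query_vector, text, knn_boost=25.0, text_boost=1.0):
    """Hypothetical helper: the same bool/should query with a boost per clause."""
    return {
        "bool": {
            "should": [
                {
                    "knn": {
                        "field": "embedding_vector",
                        "query_vector": query_vector,
                        "num_candidates": 10000,
                        "filter": [{"term": {"categories.keyword": "Dresses"}}],
                        # Multiplies the knn clause score (placeholder weight).
                        "boost": knn_boost,
                    }
                },
                {
                    "multi_match": {
                        "query": text,
                        "fields": ["categories", "name", "color"],
                        "fuzziness": "AUTO",
                        "type": "best_fields",
                        # Multiplies the multi_match clause score (placeholder weight).
                        "boost": text_boost,
                    }
                },
            ]
        }
    }


query = build_hybrid_query([1, 2, 3], "red dress")
print(json.dumps(query, indent=2))
```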
Thanks!