Hi Folks,
I have a cluster with vectors indexed in a knn index, and I'd like to find the exact K-nearest neighbors for a given vector using script_score. There are about 500K documents in total.
I'm using the following query to find the id of the top 10 nearest neighbors (documents) of test_vector, after filtering to documents that match test_key:
{
  "size": 10,
  "_source": false,
  "fields": ["id"],
  "query": {
    "script_score": {
      "query": {
        "match": {
          "key.keyword": {{test_key}}
        }
      },
      "script": {
        "source": "knn_score",
        "lang": "knn",
        "params": {
          "field": "vector",
          "query_value": {{test_vector}},
          "space_type": "innerproduct"
        }
      }
    }
  }
}
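For anyone who wants to reproduce this, here is a minimal Python sketch of how the {{test_key}} and {{test_vector}} placeholders could be filled in to produce a valid request body. The function name build_knn_script_query and the example values are made up for illustration; only the body structure comes from the query above.

```python
import json


def build_knn_script_query(test_key, test_vector, size=10):
    """Return the request body above with the two placeholders filled in.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return {
        "size": size,
        "_source": False,
        "fields": ["id"],
        "query": {
            "script_score": {
                # Pre-filter: only documents matching test_key get scored.
                "query": {"match": {"key.keyword": test_key}},
                "script": {
                    "source": "knn_score",
                    "lang": "knn",
                    "params": {
                        "field": "vector",
                        # list() so tuples or other sequences serialize as JSON arrays.
                        "query_value": list(test_vector),
                        "space_type": "innerproduct",
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    # Example values are placeholders, not real data from the cluster.
    body = build_knn_script_query("example-key", [0.1, 0.2, 0.3])
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request against the index.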
My questions are as follows:
- I observe that request_cache and query_cache are not used at all. I confirmed that query_cache and request_cache are enabled. I pulled the index stats before and after running the query and saw no change.
1.1 Can I expect script_score queries to be cached anywhere?
1.2 Can I update my search query in any way to utilize the caches and in turn improve search latency?
- Updating the match query to match on "key.keyword" instead of "key" did not improve the latency. Initially, I used a match query on just the key field to filter documents matching test_key. I found unexpected document ids in the result. So, I changed the query to match on key.keyword and expected it to reduce the search latency, because matching on just key returned a lot more documents compared to key.keyword. I understand that the ideal solution would be to use an explicit mapping so that the key field is interpreted as a keyword type, and to use term queries for exact matches. However, I was expecting this change to improve latency. Am I missing something here?
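To make the "ideal solution" mentioned above concrete, here is a rough Python sketch of what I have in mind: an explicit keyword mapping for key, plus the same script_score query with a term filter in place of the match query. The helper names are hypothetical, the mapping shows only the key field, and I have not measured whether this changes latency or cache usage. Since the type of an existing field cannot be changed in place, this would mean creating a new index and reindexing.

```python
def keyword_mapping():
    """Mapping fragment that declares `key` as a keyword field.

    Illustrative only: a real index would also carry the existing
    `vector` (knn_vector) and `id` field mappings.
    """
    return {"mappings": {"properties": {"key": {"type": "keyword"}}}}


def term_filtered_query(test_key, test_vector, size=10):
    """Same script_score query as above, with an exact term filter on `key`."""
    return {
        "size": size,
        "_source": False,
        "fields": ["id"],
        "query": {
            "script_score": {
                "query": {
                    "bool": {
                        # Exact, unanalyzed match on the keyword field.
                        "filter": {"term": {"key": test_key}}
                    }
                },
                "script": {
                    "source": "knn_score",
                    "lang": "knn",
                    "params": {
                        "field": "vector",
                        "query_value": list(test_vector),
                        "space_type": "innerproduct",
                    },
                },
            }
        },
    }
```

With this mapping the query targets key directly rather than the key.keyword sub-field.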
Thanks in advance for your time. Please let me know if you need more info from my side.
Thanks