I'm using Elasticsearch version 7.1.1.
I'm experiencing a weird behaviour.
When I use the top_terms_n rewrite in order to obtain a score in wildcard query_string queries, almost all results are missing. Without top_terms_n I obtain about 300 results out of 10M; with the rewrite I obtain only 3 results.
Is this behaviour expected? Is there any way I can obtain all the results and still have relevant scores calculated for wildcard matches?
The following query returns about 300 results (out of 10M), without rewrite.
{
  "query": {
    "query_string": {
      "query": " +_data.cern_id:(ELG-*) +object_type:d +_data.author.full_name:eric"
    }
  }
}
However, if I run the same search with a top_terms_n rewrite (here top_terms_1000), I only get 3 results back.
{
  "query": {
    "query_string": {
      "query": " +_data.cern_id:(ELG-*) +object_type:d +_data.author.full_name:eric",
      "rewrite": "top_terms_1000"
    }
  }
}
The most confusing part is that if I make the wildcard more specific, e.g. searching ELG-GENNET-* instead of ELG-*, while keeping the remaining clauses unchanged, I obtain more results: 8 instead of 3.
{
  "query": {
    "query_string": {
      "query": " +_data.cern_id:(ELG-GENNET-*) +object_type:d +_data.author.full_name:eric",
      "rewrite": "top_terms_1000"
    }
  }
}
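To make the comparison easier to reproduce, the three request bodies above can be generated from one small helper. This is only a sketch: build_query is a made-up name, not part of any Elasticsearch client, and the code only builds the JSON body. Sending it to the cluster (with the dfs_query_then_fetch search type) is left out on purpose.

```python
import json


def build_query(cern_id_pattern, rewrite=None):
    """Build the query_string request body used in this question.

    build_query is a hypothetical helper for illustration only.
    """
    query_string = {
        # The leading space is kept so the string matches the original queries.
        "query": (
            f" +_data.cern_id:({cern_id_pattern})"
            " +object_type:d +_data.author.full_name:eric"
        )
    }
    if rewrite is not None:
        # Only add the rewrite key when a rewrite method is requested.
        query_string["rewrite"] = rewrite
    return {"query": {"query_string": query_string}}


# The three variants described above.
bodies = {
    "ELG-* without rewrite": build_query("ELG-*"),
    "ELG-* with top_terms_1000": build_query("ELG-*", "top_terms_1000"),
    "ELG-GENNET-* with top_terms_1000": build_query("ELG-GENNET-*", "top_terms_1000"),
}

for label, body in bodies.items():
    print(label)
    print(json.dumps(body, indent=2))
```

Keeping every clause except the wildcard pattern and the rewrite fixed makes it clear that those two are the only things changing between the 300, 3 and 8 result counts.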
Additional details: I'm using the dfs_query_then_fetch search type.
The cern_id field mentioned above is mapped as a keyword:
"cern_id": {
  "type": "keyword",
  "normalizer": "case_accent_normalizer",
  "boost": 5
},
And the case_accent_normalizer normalizer is defined as:
"case_accent_normalizer": {
  "type": "custom",
  "filter": [
    "lowercase",
    "asciifolding"
  ]
}
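For context, here is a rough Python approximation of what this normalizer does to a cern_id value before it is indexed. It is an assumption-laden sketch, not Lucene's implementation: approx_case_accent_normalize is a made-up name, the sample value is invented, and stripping combining marks after NFKD decomposition only covers part of what the real asciifolding filter handles.

```python
import unicodedata


def approx_case_accent_normalize(value: str) -> str:
    """Approximate a lowercase + asciifolding normalizer (illustrative only)."""
    # Step 1: lowercase, mirroring the "lowercase" filter.
    lowered = value.lower()
    # Step 2: decompose accented characters and drop the combining marks.
    # This is a simplification of "asciifolding", which folds more characters.
    decomposed = unicodedata.normalize("NFKD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Invented sample value, not taken from the real index.
print(approx_case_accent_normalize("ELG-GENNET-Étude"))  # elg-gennet-etude
```

The point is that the indexed terms are lowercased and accent-free, so the terms a wildcard expands to look like elg-... rather than ELG-....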
Thanks in advance!