Hi all.
A fuzzy query is driving me crazy. It returns some results that I can understand, but it does not return others that, to me, are more similar to the search text than the ones it does return.
This is the query:
"query" : {
  "nested" : {
    "query" : {
      "bool" : {
        "should" : [
          {
            "match" : {
              "names.nameFields.value" : {
                "query" : "tania silva",
                "operator" : "OR",
                "fuzziness" : "AUTO",
                "prefix_length" : 0,
                "max_expansions" : 50,
                "minimum_should_match" : "1",
                "fuzzy_transpositions" : false,
                "lenient" : false,
                "zero_terms_query" : "NONE",
                "auto_generate_synonyms_phrase_query" : true,
                "boost" : 1.0
              }
            }
          }
        ],
        "adjust_pure_negative" : true,
        "boost" : 1.0
      }
    },
    "path" : "names.nameFields",
    "ignore_unmapped" : false,
    "score_mode" : "sum",
    "boost" : 1.0
  }
}
It returns 27 hits from my index. For instance, it returns a person (let's call her "A") who has the following names stored (the query matches at least one of them):
- vania luzia nicioli silva
- vania luzia nicioli
- vania luzia silva
- vania luzia nicioli silva
I suppose that, for Elasticsearch, "vania" is very similar to "tania", so it returns the person. OK.
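As a sanity check of that assumption, here is a minimal sketch in plain Python (not Elasticsearch's own implementation). It computes the plain Levenshtein distance, which is what applies here because fuzzy_transpositions is false, and compares it with the documented AUTO thresholds (0 edits for terms of 1-2 characters, 1 edit for 3-5, 2 edits for longer terms). The function names levenshtein and auto_fuzziness are made up for this illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def auto_fuzziness(term: str) -> int:
    """Edits allowed by fuzziness AUTO for a term of this length."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


query_term = "tania"
allowed = auto_fuzziness(query_term)
for stored in ["vania", "luzia", "nicioli", "silva"]:
    distance = levenshtein(query_term, stored)
    print(stored, distance, "match" if distance <= allowed else "no match")
```

Under these assumptions "vania" is exactly one substitution away from "tania", which is within the single edit that AUTO allows for a five-letter term.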
My problem is that it does NOT return another person (let's call her "B") with the following name stored:
- vania silva
IMHO, B's name "vania silva" is more similar to "tania silva" than A's "vania luzia silva".
I have been playing around with the query, and if I change the searched name to "bania silva", "wania silva", "fania silva" or "cania silva", it DOES return "A" and also "B". The same happens when I put any other character in the first position (where "vania" has its "v"), except "m", "s" or "t".
It also finds "B" if I raise max_expansions from 50 to 60.
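One possible way to picture the max_expansions effect is sketched below in plain Python. This is only a guess, not confirmed Elasticsearch behaviour: it assumes the fuzzy term is expanded into at most max_expansions index terms, ranked by edit distance with ties broken alphabetically. The term dictionary is entirely made up (59 synthetic terms close to "tania"), and expand_fuzzy is a hypothetical helper, not a real API.

```python
import string


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def expand_fuzzy(term, dictionary, max_edits, max_expansions):
    """Hypothetical expansion: keep the closest terms, capped at max_expansions."""
    scored = [(levenshtein(term, t), t) for t in dictionary]
    candidates = sorted(pair for pair in scored if pair[0] <= max_edits)
    # ASSUMPTION: ties at equal distance are broken alphabetically.
    return [t for _, t in candidates[:max_expansions]]


# Made-up index terms: many variants sit within one edit of "tania".
letters = string.ascii_lowercase
dictionary = sorted(
    {c + "ania" for c in letters}              # aania ... zania
    | {"tani" + c for c in letters}            # tania ... taniz
    | {"ta" + c + "ia" for c in "abcdefgh"}    # taaia ... tahia
)

print("vania" in expand_fuzzy("tania", dictionary, 1, 50))
print("vania" in expand_fuzzy("tania", dictionary, 1, 60))
print("vania" in expand_fuzzy("bania", dictionary, 1, 50))
```

In this toy setup, "tania" has more than 50 neighbours within one edit, so "vania" falls outside the cap at 50 but inside it at 60, while "bania" has far fewer neighbours and keeps "vania" either way. Whether my real index behaves like this I cannot tell.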
Can anyone please explain this behaviour to me?
Thanks, best regards.