I want my match query to tolerate misspellings, so I added fuzziness as shown below. However, this completely changed the results compared with the same query without fuzziness.
I am using the following settings, analyzers and mapping:
{
   "state": "open",
   "settings": {
      "index": {
         "creation_date": "1457443337681",
         "analysis": {
            "filter": {
               "my_edge_ngram_analyzer": {
                  "type": "edgeNGram",
                  "min_gram": "2",
                  "max_gram": "10"
               },
               "my_word_delimiter": {
                  "catenate_all": "true",
                  "type": "word_delimiter"
               }
            },
            "analyzer": {
               "my_analyzer": {
                  "filter": [
                     "standard",
                     "lowercase",
                     "my_word_delimiter",
                     "my_edge_ngram_analyzer"
                  ],
                  "type": "custom",
                  "tokenizer": "whitespace"
               }
            }
         },
         "number_of_shards": "5",
         "number_of_replicas": "1",
         "version": {
            "created": "2020099"
         }
      }
   },
   "Name": {
      "search_analyzer": "standard",
      "analyzer": "my_analyzer",
      "type": "string"
   },
   "ShortDescription": {
      "search_analyzer": "standard",
      "analyzer": "my_analyzer",
      "type": "string"
   }
}
},
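To make the index-time side concrete, here is a rough Python sketch of what I believe the edge n-gram filter configured above (min_gram 2, max_gram 10) emits for a single token. The function name `edge_ngrams` is hypothetical, and the sketch deliberately ignores the whitespace tokenizer, the lowercase filter and `my_word_delimiter`, so it only approximates what actually ends up in the index.

```python
def edge_ngrams(token, min_gram=2, max_gram=10):
    """Return the leading substrings of `token` with lengths min_gram..max_gram.

    Assumption: this mimics only the edge n-gram step of my_analyzer,
    not the tokenizer or the other filters in the chain.
    """
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


# Tokens are written in lowercase because the real analyzer lowercases first.
for token in ["hp", "301", "305a", "ch561ee"]:
    print(token, "->", edge_ngrams(token))
```

If this is right, the index contains many short prefixes such as "30", while the search side (the standard analyzer) looks up the whole terms "hp" and "301".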
Here is the query without fuzziness:
{
   "query": {
      "bool": {
         "should": [
            {
               "multi_match": {
                  "type": "best_fields",
                  "query": "hp 301",
                  "fields": [
                     "Name^7",
                     "ShortDescription^6"
                  ]
               }
            }
         ]
      }
   }
}
As expected, this query returns the most relevant results for hp 301:
"_source": {
   "id": 1,
   "Name": "l HP CH561EE / 301 Black",
   "ShortDescription": "301
"_source": {
   "id": 2,
   "Name": " HP E5Y87EE / 301 Set (2 x Black)",
   "ShortDescription": "301
I expect the same results when I use fuzziness. As I understand it, fuzziness should only compensate for misspellings, not change the results of a correctly spelled query.
If I set fuzziness to AUTO with prefix_length 0, the query looks like this:
{
   "query": {
      "bool": {
         "should": [
            {
               "multi_match": {
                  "type": "best_fields",
                  "query": "hp 301",
                  "fuzziness": "AUTO",
                  "prefix_length": 0,
                  "fields": [
                     "Name^7",
                     "ShortDescription^6"
                  ]
               }
            }
         ]
      }
   }
}
The results below are completely irrelevant: the only query term they contain is HP. How do they get the highest scores?
"_source": {
   "id": 123,
   "Name": "HP CE411A / 305A Cyan",
   "ShortDescription": "305A",
"_source": {
   "id": 1234,
   "Name": "HP CC530A bis CC533A Set",
   "ShortDescription": "304A",
Even worse, when I set fuzziness to 2 instead of AUTO, I get results that make no sense. Why would I get the second one, which contains neither hp nor 301?
"_source": {
   "id": 345,
   "Name": "Utax 4401410015 Black",
   "ShortDescription": "LP3014",
"_source": {
   "id": 3400,
   "Name": "Konica Minolta 8936-404 / EP302B Black",
   "ShortDescription": "EP302B",
Furthermore, when I use "fuzziness": 2, "prefix_length": 1 in the same query, I get different results again:
"_source": {
   "id": 778,
   "Name": "593-10122 / HG308 Yellow",
   "ShortDescription": "HG308",
With "fuzziness": "AUTO", "prefix_length": 1 the results are different yet again:
"_source": {
   "id": 8990,
   "Name": "C 13 S0 53021 / 3021",
   "ShortDescription": "3021",
Can somebody explain what I am doing wrong? Am I misunderstanding how fuzziness works?
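In case it helps narrow this down, here is a small Python sketch that computes the plain Levenshtein edit distance between my query terms and some candidate terms. The candidates are my own guesses at tokens the analyzer might produce from the Name and ShortDescription values above; I have not verified them against the index. The function `levenshtein` is just an illustration and not what Elasticsearch runs internally. As far as I know, Elasticsearch also counts a swap of two adjacent characters as a single edit by default, which this sketch does not model.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


# (query term, candidate indexed term); the candidates are assumptions.
pairs = [
    ("301", "30"), ("301", "302"), ("301", "305"), ("301", "308"),
    ("301", "3021"), ("hp", "lp"), ("hp", "ep"), ("hp", "hg"),
]
for query_term, candidate in pairs:
    print(query_term, candidate, levenshtein(query_term, candidate))
```

Every pair above is a single edit apart, so if those tokens really are in the index, they would all be within reach of "fuzziness": 2.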