Fuzzy search in Elasticsearch returns unwanted matches

I want to build an autocomplete service with Elasticsearch, and I need a Google-style "did you mean" feature. I researched how to do this and found fuzzy search. When I enter "Alman", I want to get only "Almanya", but I also receive "Umman". How can I get only "Almanya"?

    {
        "highlight": {"fields": [{"CountryName": {"type": "plain"}}]},
        "query": {
            "bool": {
                "should": [
                    {
                        "match": {
                            "CountryName": {
                                "boost": 1,
                                "fuzziness": 5,
                                "query": "Alman"
                            }
                        }
                    },
                    {
                        "match": {
                            "CountryNameTurkish": {
                                "boost": 1,
                                "fuzziness": 5,
                                "query": "Alman"
                            }
                        }
                    }
                ]
            }
        }
    }

Response

    {
        "hits": {
            "total": 2,
            "max_score": 1.9666269,
            "hits": [
                {
                    "_index": "country_codes_v1",
                    "_type": "kafka_connect",
                    "_id": "OM",
                    "_score": 1.9666269,
                    "_source": {
                        "CountryCode": "OM",
                        "CountryName": "Oman",
                        "Continent": "Middle East",
                        "CapitalCityCode": "MCT",
                        "Duplicate": 0,
                        "CountrySlug": "om-oman",
                        "CountryNameTurkish": "Umman"
                    }
                },
                {
                    "_index": "country_codes_v1",
                    "_type": "kafka_connect",
                    "_id": "DE",
                    "_score": 1.6418773,
                    "_source": {
                        "CountryCode": "DE",
                        "CountryName": "Germany",
                        "Continent": "Europe",
                        "CapitalCityCode": "BER",
                        "Duplicate": 0,
                        "CountrySlug": "de-germany",
                        "CountryNameTurkish": "Almanya"
                    }
                }
            ]
        }
    }

Hi,

You set fuzziness to 5. Fuzziness uses the Levenshtein edit distance, which measures the number of single-character edits required to transform one word into another. There are four types of one-character edits:

  • Substitution of one character for another: _f_ox -> _b_ox
  • Insertion of a new character: sic -> sic_k_
  • Deletion of a character: b_l_ack -> back
  • Transposition of two adjacent characters: _st_ar -> _ts_ar
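
The four edit types above can be sketched in a few lines of Python. This is only an illustration of the distance being described, not the implementation Lucene or Elasticsearch uses internally; the function name `osa_distance` is made up for this example, and the inputs are lowercased by hand on the assumption that the fields use an analyzer that lowercases terms.

```python
def osa_distance(a: str, b: str) -> int:
    """Count the substitutions, insertions, deletions and adjacent
    transpositions needed to turn string a into string b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # adjacent transposition, e.g. "st" <-> "ts"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


print(osa_distance("star", "tsar"))      # 1
print(osa_distance("alman", "umman"))    # 2
print(osa_distance("alman", "oman"))     # 2
print(osa_distance("alman", "almanya"))  # 2
```

Under this measure "umman", "oman" and "almanya" are all two edits away from "alman", which is consistent with both documents appearing in the response.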

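Setting fuzziness to 5 asks for a maximum edit distance of 5, although Elasticsearch documents only 0, 1, 2 and AUTO as valid values. The edit distance between Alman and Umman is 2, so the fuzzy search matches this word. Note that Almanya is also 2 edits away from Alman, so lowering fuzziness alone will not separate the two. You can set fuzziness to AUTO, and you can also set prefix_length, which requires the first characters of a term to match exactly and can filter out results such as Umman.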
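
As a rough sketch of that suggestion, the request body from the question could be rebuilt with `prefix_length` set. The helper name `build_country_query` and the values `fuzziness=2` and `prefix_length=1` are assumptions chosen for illustration, not tested settings. `AUTO` is not used here because, for a five-character term such as "alman", the documented `AUTO` thresholds allow only one edit, which would exclude "Almanya" as well.

```python
import json


def build_country_query(text, fuzziness=2, prefix_length=1):
    """Build the search body from the question, adding prefix_length.

    Hypothetical helper: the defaults are illustrative assumptions.
    """
    def fuzzy_match(field):
        # "boost": 1 from the original query is the default, so it is omitted.
        return {
            "match": {
                field: {
                    "query": text,
                    "fuzziness": fuzziness,
                    "prefix_length": prefix_length,
                }
            }
        }

    return {
        "highlight": {"fields": [{"CountryName": {"type": "plain"}}]},
        "query": {
            "bool": {
                "should": [
                    fuzzy_match("CountryName"),
                    fuzzy_match("CountryNameTurkish"),
                ]
            }
        },
    }


# Print the JSON body that would be sent to the _search endpoint.
print(json.dumps(build_country_query("Alman"), indent=2))
```

With a prefix length of 1, a candidate term has to start with "a", so "umman" and "oman" should no longer qualify, while "almanya" stays within two edits of "alman". For a real autocomplete service a larger prefix length may be worth trying, at the cost of tolerating fewer typos at the start of the word.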
