Return Results With Diacritics

Hi All,
I'm trying to return results for names with diacritics. I want to search using plain characters and receive all matching results, including names with diacritics (for example, a search for Jose should also return Jóse).
I have updated my settings to the following:

{
  "analysis": {
    "filter": {
      "edge_ngram_filter": {
        "type": "edgeNGram",
        "min_gram": "1",
        "max_gram": "25"
      },
      "my_ascii_folding": {
        "type": "asciifolding",
        "preserve_original": "true"
      }
    },
    "normalizer": {
      "keyword_normalizer": {
        "filter": [
          "lowercase"
        ],
        "type": "custom",
        "char_filter": []
      }
    },
    "analyzer": {
      "input_analyzer": {
        "filter": [
          "lowercase",
          "my_ascii_folding"
        ],
        "type": "custom",
        "tokenizer": "whitespace"
      },
      "autocomplete": {
        "filter": [
          "lowercase",
          "edge_ngram_filter"
        ],
        "type": "custom",
        "tokenizer": "whitespace"
      },
      "email_analyzer": {
        "tokenizer": "email_tokenizer"
      }
    },
    "tokenizer": {
      "email_tokenizer": {
        "type": "uax_url_email"
      }
    }
  }
}

The following is the sample mapping:

"lastName": {
  "type": "text",
  "fields": {
    "aggregation": {
      "type": "keyword"
    },
    "keyword": {
      "type": "keyword",
      "normalizer": "keyword_normalizer"
    }
  },
  "analyzer": "autocomplete",
  "search_analyzer": "input_analyzer"
}

Any help is appreciated.

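It looks good, but you need to add the asciifolding filter to the autocomplete analyzer as well. Right now only the search analyzer folds diacritics, so a search for "jose" is compared against index tokens that still contain "ó" and never matches.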

I'd also:

  • remove preserve_original
  • apply asciifolding first and the edge n-gram filter after it (for performance)
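As a rough sketch, the filter and analyzer parts of the settings could look like this after those changes. Everything not shown (normalizer, email analyzer, tokenizer) is assumed to stay as it is. The type name edge_ngram and the numeric min_gram/max_gram values are my assumptions about your version: recent releases use edge_ngram and treat the camel-case edgeNGram as deprecated, so adjust if your cluster differs.

```json
{
  "analysis": {
    "filter": {
      "edge_ngram_filter": {
        "type": "edge_ngram",
        "min_gram": 1,
        "max_gram": 25
      },
      "my_ascii_folding": {
        "type": "asciifolding"
      }
    },
    "analyzer": {
      "input_analyzer": {
        "type": "custom",
        "tokenizer": "whitespace",
        "filter": ["lowercase", "my_ascii_folding"]
      },
      "autocomplete": {
        "type": "custom",
        "tokenizer": "whitespace",
        "filter": ["lowercase", "my_ascii_folding", "edge_ngram_filter"]
      }
    }
  }
}
```

Analysis settings on an existing index are not applied to documents that are already indexed, so you would need to reindex your data for the new autocomplete analyzer to take effect.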

If it doesn't work, could you provide a full recreation script as described in About the Elasticsearch category? It will help us better understand what you are doing. Please try to keep the example as simple as possible.

A full reproduction script is something anyone can copy and paste into the Kibana Dev Console and run to reproduce your use case. It helps readers understand, reproduce and, if needed, fix your problem. It will also most likely get you a faster answer.
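To give an idea of the shape, here is a minimal sketch of such a script for this case. The index name diacritics-test is made up for the example, the typeless mapping assumes Elasticsearch 7 or later, and the analysis block is reduced to the two analyzers relevant to lastName with folding applied on both the index and search side.

```console
# Hypothetical index name; create the index with a stripped-down analysis chain
PUT /diacritics-test
{
  "settings": {
    "analysis": {
      "filter": {
        "edge_ngram_filter": { "type": "edge_ngram", "min_gram": 1, "max_gram": 25 }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "asciifolding", "edge_ngram_filter"]
        },
        "input_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "lastName": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "input_analyzer"
      }
    }
  }
}

# Index one document with a diacritic and make it searchable immediately
PUT /diacritics-test/_doc/1?refresh=true
{ "lastName": "Jóse" }

# Search with the plain-ASCII spelling
GET /diacritics-test/_search
{
  "query": { "match": { "lastName": "Jose" } }
}
```

With folding in both analyzers, the last request should return the document. If it does not on your cluster, posting the script together with the response you get makes the difference easy to spot.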
