Match queries and ASCII folding

Hi all,

I have an index that was created with the following configuration:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "std_asciifolding": {
          "tokenizer": "standard",
          "filter": [ "std_asciifold_preserve", "lowercase" ]
        }
      },
      "filter": {
        "std_asciifold_preserve": {
          "type": "asciifolding",
          "preserve_original": true
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "display_name": {
        "type": "text",
        "analyzer": "std_asciifolding"
      }
    }
  }
}

The idea is that the display_name field is searchable in a case- and diacritics-insensitive way, to facilitate searching through non-English names.

One of the documents in this index looks like this:

{
    "user_id": "XXXXXXX",
    "display_name": "Gáo foo",
    "avatar_url": null
}

Here the accented character is a lowercase a followed by U+0301 (Combining Acute Accent), not a single precomposed code point.
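To make the distinction concrete, here is a small Python sketch (not part of my setup, purely an illustration) comparing the decomposed form stored in my document with the precomposed form obtained through NFC normalization. The variable names are mine. The prefix checks at the end only operate on the raw string; they happen to show the same pattern as my queries, but I'm only assuming that this is related to what the analyzer does.

```python
import unicodedata

# Decomposed form, as in my document: 'a' followed by U+0301.
decomposed = "Ga\u0301o foo"
# Precomposed form: NFC merges 'a' + U+0301 into the single code point U+00E1.
precomposed = unicodedata.normalize("NFC", decomposed)

def describe(text):
    """Return the code point and Unicode name of every character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

print(describe(decomposed[:4]))   # G, a, combining acute accent, o
print(describe(precomposed[:3]))  # G, a-with-acute, o

# The two strings render the same but are not equal code point by code point.
print(decomposed == precomposed)

# Raw prefix checks on the decomposed string (no analysis involved).
print(decomposed.startswith("Ga"))
print(decomposed.startswith("Gao"))
```

The two strings look identical on screen, so the difference is easy to miss when copying display names around.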

When I try searching for this user in my index, the following query pulls up the document:

{
    "query": {
        "match_phrase_prefix": {
            "display_name": {
                "query": "Ga"
            }
        }
    }
}

However, the following query does not:

{
    "query": {
        "match_phrase_prefix": {
            "display_name": {
                "query": "Gao"
            }
        }
    }
}

This leads to confusing results where at first it looks like I can search for my user without the combining acute accent, but as I finish typing their display name it's suddenly not there anymore.

Am I missing something here?

It's worth noting that the filter seems to be working as expected for other results, e.g. if I have the following document in my index:

{
    "user_id": "XXXXXXX",
    "display_name": "Léonard foo",
    "avatar_url": null
}

Then the following search query returns my result:

{
    "query": {
        "match_phrase_prefix": {
            "display_name": {
                "query": "Leo"
            }
        }
    }
}
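One way I could narrow this down is to look at the tokens the analyzer actually emits for the problematic name, using the _analyze API. Below is a minimal Python sketch that only builds the request path and body; the index name my_index and the helper build_analyze_request are placeholders I made up for illustration, and sending the request to the cluster is left out.

```python
import json

INDEX_NAME = "my_index"  # hypothetical index name, replace with the real one

def build_analyze_request(text, analyzer="std_asciifolding"):
    """Build the path and JSON body for an _analyze call on the custom analyzer."""
    path = f"/{INDEX_NAME}/_analyze"
    body = {"analyzer": analyzer, "text": text}
    return path, body

# Same display name as in my document: 'a' followed by U+0301.
path, body = build_analyze_request("Ga\u0301o foo")

print("GET", path)
# json.dumps escapes non-ASCII by default, so the combining accent stays visible.
print(json.dumps(body, indent=2))
```

Comparing the tokens returned for "Gáo foo" with those returned for "Léonard foo" should show whether the filter treats the two names differently.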
