Pokémon - match_phrase fails with edge_ngram & asciifolding

Hi,

I am having an issue with a match_phrase query against a field that uses an edge_ngram tokenizer and an asciifolding filter (Elasticsearch v7.7).

The field mapping and the analysis settings are:

"mainTitle": {
	"type": "text",
	"fields": {
		"autocomplete": {
			"type": "text",
			"analyzer": "custom_lowercase"
		}
...
"analyzer": {
	"custom_lowercase": {
		"filter": [
			"lowercase",
			"asciifolding"
		],
		"type": "custom",
		"tokenizer": "autocomplete"
	}
},
"tokenizer": {
	"autocomplete": {
		"token_chars": [
			"letter",
			"digit"
		],
		"min_gram": "2",
		"type": "edge_ngram",
		"max_gram": "10"
	}
}

Calling _analyze with

{
  "tokenizer" : "autocomplete",
  "filter" : ["lowercase", "asciifolding"],
  "text" : "Pokémon"
}

generates what I believe are the correct tokens:

{
    "tokens": [
        {
            "token": "po",
            "start_offset": 0,
            "end_offset": 2,
            "type": "word",
            "position": 0
        },
        {
            "token": "pok",
            "start_offset": 0,
            "end_offset": 3,
            "type": "word",
            "position": 1
        },
        {
            "token": "poke",
            "start_offset": 0,
            "end_offset": 4,
            "type": "word",
            "position": 2
        },
        {
            "token": "pokem",
            "start_offset": 0,
            "end_offset": 5,
            "type": "word",
            "position": 3
        },
        {
            "token": "pokemo",
            "start_offset": 0,
            "end_offset": 6,
            "type": "word",
            "position": 4
        },
        {
            "token": "pokemon",
            "start_offset": 0,
            "end_offset": 7,
            "type": "word",
            "position": 5
        }
    ]
}
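For anyone who wants to reason about these tokens outside a cluster, here is a minimal Python sketch of the same chain (edge_ngram with min_gram 2 and max_gram 10, token_chars letter/digit, then lowercase and asciifolding). The function names are made up for illustration, and fold() only approximates asciifolding by dropping combining marks after Unicode decomposition; it is not the Lucene implementation.

```python
import unicodedata

MIN_GRAM, MAX_GRAM = 2, 10  # values from the "autocomplete" tokenizer above


def split_words(text):
    """Split on anything that is not a letter or digit (token_chars)."""
    words, current = [], []
    for ch in text:
        if ch.isalpha() or ch.isdigit():
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


def fold(token):
    """Rough stand-in for lowercase + asciifolding (an approximation)."""
    decomposed = unicodedata.normalize("NFKD", token.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text):
    """Return (position, token) pairs, one position per emitted gram."""
    tokens = []
    for word in split_words(text):
        for size in range(MIN_GRAM, min(MAX_GRAM, len(word)) + 1):
            tokens.append(fold(word[:size]))
    return list(enumerate(tokens))


if __name__ == "__main__":
    # "Pok\u00e9mon" is "Pokémon" written with a precomposed é.
    for position, token in analyze("Pok\u00e9mon"):
        print(position, token)
```

With a precomposed é this yields the same six tokens and positions as the _analyze response above, so the sketch at least agrees with what the cluster reports for this input.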

However, when I perform the query:

{
	"match_phrase": {
		"mainTitle.autocomplete": {
			"query": "poké"
		}
	}
}

the document titled Pokémon is not returned. In fact, it is not returned whether I search for poké or poke, but it is returned if I search for pok.
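Since match_phrase runs the query text through the field's analyzer and then needs the resulting terms at consecutive positions in the document, the symptom can be explored with a toy model. The sketch below is only that: the helper names are hypothetical, fold() is a rough approximation of lowercase + asciifolding, and "Case B" (stored tokens produced without asciifolding, for example because documents were indexed before the filter was added) is an assumption to rule out, not a confirmed cause.

```python
import unicodedata


def fold(token):
    """Rough stand-in for lowercase + asciifolding (an approximation)."""
    decomposed = unicodedata.normalize("NFKD", token.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edge_ngrams(word, token_filter, min_gram=2, max_gram=10):
    """Edge n-grams of a single word, each passed through token_filter."""
    upper = min(max_gram, len(word))
    return [token_filter(word[:n]) for n in range(min_gram, upper + 1)]


def phrase_matches(indexed, query):
    """True if the query tokens occur at consecutive positions in indexed."""
    if not query:
        return False
    span = len(query)
    return any(indexed[i:i + span] == query
               for i in range(len(indexed) - span + 1))


# Case A: the document was indexed with lowercase + asciifolding.
indexed_folded = edge_ngrams("Pok\u00e9mon", fold)
# Case B (hypothetical): the stored tokens were produced with lowercase only.
indexed_unfolded = edge_ngrams("Pok\u00e9mon", str.lower)

results = {}
for text in ("pok", "poke", "pok\u00e9"):
    # The query text always goes through the full analyzer in this model.
    query = edge_ngrams(text, fold)
    results[text] = (phrase_matches(indexed_folded, query),
                     phrase_matches(indexed_unfolded, query))
    print(text, results[text])
```

In this model Case A matches all three queries, while Case B matches only pok, which is the behaviour described above. That may make it worth comparing the tokens actually stored for the document with the _analyze output, but the model says nothing definite about the real index.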

Please advise, thank you.
