Hi,
I am having an issue with a match_phrase query against a field that uses an edge_ngram tokenizer and the asciifolding filter (Elasticsearch v7.7).
The field is:
"mainTitle": {
  "type": "text",
  "fields": {
    "autocomplete": {
      "type": "text",
      "analyzer": "custom_lowercase"
    }
...
"analyzer": {
  "custom_lowercase": {
    "filter": [
      "lowercase",
      "asciifolding"
    ],
    "type": "custom",
    "tokenizer": "autocomplete"
  }
},
"tokenizer": {
  "autocomplete": {
    "token_chars": [
      "letter",
      "digit"
    ],
    "min_gram": "2",
    "type": "edge_ngram",
    "max_gram": "10"
  }
}
Running _analyze with
{
  "tokenizer" : "autocomplete",
  "filter" : ["lowercase", "asciifolding"],
  "text" : "Pokémon"
}
generates what I believe are the correct tokens:
{
  "tokens": [
    {
      "token": "po",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 0
    },
    {
      "token": "pok",
      "start_offset": 0,
      "end_offset": 3,
      "type": "word",
      "position": 1
    },
    {
      "token": "poke",
      "start_offset": 0,
      "end_offset": 4,
      "type": "word",
      "position": 2
    },
    {
      "token": "pokem",
      "start_offset": 0,
      "end_offset": 5,
      "type": "word",
      "position": 3
    },
    {
      "token": "pokemo",
      "start_offset": 0,
      "end_offset": 6,
      "type": "word",
      "position": 4
    },
    {
      "token": "pokemon",
      "start_offset": 0,
      "end_offset": 7,
      "type": "word",
      "position": 5
    }
  ]
}
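To sanity-check that token list outside of Elasticsearch, here is a rough Python sketch of the same analysis chain (edge_ngram with min_gram 2 and max_gram 10, then lowercase and asciifolding). It is only an approximation under my own assumptions: `edge_ngrams` and `ascii_fold` are hypothetical helper names, not Elasticsearch APIs; `str.isalnum()` stands in for the "letter" and "digit" token_chars; and accent stripping via Unicode decomposition stands in for the real asciifolding filter.

```python
import unicodedata


def ascii_fold(text):
    # Approximation of asciifolding: decompose characters, then drop combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edge_ngrams(text, min_gram=2, max_gram=10):
    # Split on anything that is not a letter or digit, then emit prefixes of
    # each word, one position per gram (as in the _analyze output above).
    tokens = []
    position = 0
    start = None
    for i, ch in enumerate(text + " "):  # trailing space flushes the last word
        if ch.isalnum():
            if start is None:
                start = i
            continue
        if start is None:
            continue
        word = text[start:i]
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append({
                "token": ascii_fold(word[:n].lower()),
                "start_offset": start,
                "end_offset": start + n,
                "position": position,
            })
            position += 1
        start = None
    return tokens


if __name__ == "__main__":
    # "\u00e9" is the precomposed form of e-acute.
    for tok in edge_ngrams("Pok\u00e9mon"):
        print(tok)
```

With the precomposed "é" this gives the same six tokens and positions as the _analyze response, so the index-time analysis at least looks consistent to me.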
However, when I perform the query:
{
  "match_phrase": {
    "mainTitle.autocomplete": {
      "query": "poké"
    }
  }
}
I don't get the document with the title Pokémon. In fact, whether I search for poké or poke, the document is not returned, but if I search for pok, it is returned.
Please advise, thank you.