Can't do case-insensitive search in Elasticsearch

I'm new to Elasticsearch and trying to get this query right.
I have a document like this:

{
    "id": 1,
    "name": "Văn Hiến"
}

I want to get that document in 3 cases:
1/ User input is: "v" or "h" or "i",...
2/ User input is: "Văn" or "văn" or "hiến",...
3/ User input is: "va" or "van" or "van hi",...

I can currently search for cases 1 and 2, but not case 3, where the user input doesn't include the Vietnamese tone marks (diacritics).
This is my query; I'm using Python:

query = {
    "bool": {
        "should": [
            {
                "match": {
                    "name": name.lower()
                }
            },
            {
                "wildcard": {
                    "name": {
                        "value": f"*{name.lower()}*"
                    }
                }
            }
        ]
    }
}

Can anyone help me with this? Any help will be appreciated.

Hi @Dang_Hai

I would avoid using * at the beginning of patterns. The Elastic documentation recommends against it:

Avoid beginning patterns with * or ?. This can increase the iterations needed to find matching terms and slow search performance.

I believe you can succeed using the Edge n-gram tokenizer.
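
As a rough sketch of what that could look like from Python: the index body below uses the built-in `edge_ngram` tokenizer at index time and the `standard` analyzer at search time. The names `name_prefix_tokenizer` and `name_index_analyzer`, and the `min_gram`/`max_gram` values, are assumptions made up for illustration. The helper `edge_ngrams` is only a plain-Python imitation of what the tokenizer emits for a single word, to show why prefix input matches.

```python
# Hypothetical index body: tokenizer/analyzer names and gram sizes are assumptions.
index_body = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "name_prefix_tokenizer": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 10,
                    "token_chars": ["letter", "digit"],
                }
            },
            "analyzer": {
                "name_index_analyzer": {
                    "type": "custom",
                    "tokenizer": "name_prefix_tokenizer",
                    "filter": ["lowercase"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                # Prefixes are generated when indexing only.
                "analyzer": "name_index_analyzer",
                # The user's input is tokenized normally when searching.
                "search_analyzer": "standard",
            }
        }
    },
}


def edge_ngrams(word, min_gram=1, max_gram=10):
    """Imitate edge n-gram output for one word: its prefixes only."""
    longest = min(len(word), max_gram)
    return [word[:n] for n in range(min_gram, longest + 1)]


# Prefixes that would be indexed for the lowercased words of "Văn Hiến".
prefixes = edge_ngrams("văn") + edge_ngrams("hiến")
```

Note that edge n-grams only cover the start of each word, so "v", "h" and "hi" would be indexed but a lone "i" would not. If matching in the middle of a word is really required, the `ngram` tokenizer is the usual alternative, at the cost of a larger index.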

That's a question of analyzers. As @RabBit_BR suggested, you should not use a wildcard query in this case, but rather the right analyzers with a match query.

A wildcard query compares the pattern you entered, as-is, with the terms stored in the inverted index.

If your analyzer has indexed Văn as văn, then searching for van won't match...
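
To make that concrete, here is a small simulation in plain Python. It does not talk to Elasticsearch; `indexed_terms` is an assumed picture of what an analyzer that only lowercases would store for "Văn Hiến", and `fnmatchcase` stands in for the wildcard comparison.

```python
from fnmatch import fnmatchcase

# Assumed index contents for "Văn Hiến" after lowercasing only.
indexed_terms = ["văn", "hiến"]


def wildcard_hits(pattern, terms):
    """Return the terms that match the wildcard pattern, compared as-is."""
    return [term for term in terms if fnmatchcase(term, pattern)]


# Input that keeps the tone marks finds the stored term.
with_tone = wildcard_hits("*văn*", indexed_terms)

# Input without tone marks finds nothing: "a" and "ă" are different characters.
without_tone = wildcard_hits("*van*", indexed_terms)
```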

Have a look at the ASCII folding token filter.
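
A minimal sketch of how that filter could be wired in, assuming a custom analyzer (the name `folded_name_analyzer` is made up here) that chains the built-in `lowercase` and `asciifolding` token filters. The `fold` function is a plain-Python approximation of the effect, based on Unicode decomposition. It is not the filter's actual implementation, and it will not handle letters such as đ that have no decomposed form.

```python
import unicodedata

# Hypothetical analyzer name; "lowercase" and "asciifolding" are built-in filters.
analysis_settings = {
    "analysis": {
        "analyzer": {
            "folded_name_analyzer": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "asciifolding"],
            }
        }
    }
}


def fold(text):
    """Approximate lowercase + ASCII folding by dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The document and the user's input end up in the same folded form.
folded_doc = fold("Văn Hiến")
folded_query = fold("van hi")
```

If the same analyzer is applied to the `name` field at both index and search time, a plain match query such as `{"match": {"name": name}}` should then find the document whether or not the user types the tone marks.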
