Encoding Issues: Term Filter not working on text with Diacritics


(Dadepo) #1

I do not get any results when I run a term filter on a field that holds text with diacritics. If I run a term filter on the same field with a value that contains no diacritics or accents, I do get results.

For instance, if I have the following document:

{
    "name": "funke"
}

and I run this search:

"filter": {
    "term": {
        "name": "funke"
    }
}

I do get a result. But if I have the following document, whose value contains accents:

{
    "name": "Bọ́lánlé"
}

and I search like this:

"filter": {
    "term": {
        "name": "Bọ́lánlé"
    }
}

I get zero hits!
What do I need to do to make the term filter work on text with special characters?


(Luca Cavanna) #2

This seems to depend on the text analysis that you apply to your documents and fields. You can use the analyze API to see what happens to the field content when the document is indexed; it most likely gets normalized (lowercased, for example). That does not happen to the value you pass to a term filter or term query, because those do not support text analysis, so the value has to match the indexed term exactly. I would suggest moving to the match query, which supports text analysis and by default picks the analyzer that was used at index time for the field you are querying.
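
To illustrate the difference, here is a minimal Python sketch of a toy inverted index. It is not Elasticsearch code: the analyze function, the ToyIndex class, and its term and match methods are hypothetical stand-ins. The analyzer is assumed to do nothing more than lowercase and split on whitespace, since the actual mapping for the name field is not shown in this thread.

```python
def analyze(text):
    """Toy analyzer (an assumption, not a real Elasticsearch analyzer):
    lowercase the text and split it on whitespace."""
    return text.lower().split()


class ToyIndex:
    """Hypothetical in-memory inverted index: term -> set of document ids."""

    def __init__(self):
        self.postings = {}

    def index(self, doc_id, text):
        # Index time: the text is analyzed before the terms are stored.
        for token in analyze(text):
            self.postings.setdefault(token, set()).add(doc_id)

    def term(self, value):
        # Term-style lookup: the value is used as-is, with no analysis.
        return self.postings.get(value, set())

    def match(self, value):
        # Match-style lookup: the value goes through the same analyzer first.
        hits = set()
        for token in analyze(value):
            hits |= self.postings.get(token, set())
        return hits


name = "Bọ́lánlé"

idx = ToyIndex()
idx.index(1, "funke")
idx.index(2, name)

print(idx.term("funke"))  # {1}   (already lowercase, so the raw value matches)
print(idx.term(name))     # set() (the stored term is lowercased, the raw value is not)
print(idx.match(name))    # {2}   (the value is analyzed the same way as at index time)
```

Under these assumptions the unanalyzed lookup misses only because the stored term no longer equals the raw input. What your analyzer really produces for the field is something the analyze API can confirm.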

