case_insensitive not working on wildcard field type with Cyrillic data

Hello,

When an index has a field of type wildcard that is filled with Cyrillic data, a wildcard query with case_insensitive: true finds no documents.

Note: we are currently using version 7.17.8

Test example:

PUT /index
{
  "mappings": {
    "properties": {
      "name": {
        "type": "wildcard"
      }
    }
  }
}

POST /index/_doc/1
{
  "name": "ТЕСТ"
}

POST /index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "name": {
              "value": "*Тест*",
              "case_insensitive": true
            }
          }
        }
      ]
    }
  }
}

I tried to search for a fix but could not find anything. Is there anything that can help us solve this issue?

NOTE: Both the indexed data and the query string consist entirely of Cyrillic characters. Their UTF-8 bytes are shown below.

Cyrillic:

ТЕСТ - 0xd0a2d095d0a1d0a2
Тест - 0xd0a2d0b5d181d182

While Latin would be:

TECT - 0x54454354
Tect - 0x54656374
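
For anyone who wants to double-check the byte values above, here is a minimal Python sketch that prints the UTF-8 encoding of each string. The helper name utf8_hex is only illustrative.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of text as a 0x-prefixed hex string."""
    return "0x" + text.encode("utf-8").hex()


# The first two words are Cyrillic, the last two are Latin look-alikes.
samples = [
    ("Cyrillic", "ТЕСТ"),
    ("Cyrillic", "Тест"),
    ("Latin", "TECT"),
    ("Latin", "Tect"),
]

for script, word in samples:
    print(f"{script}: {word} - {utf8_hex(word)}")
```

Each Cyrillic letter takes two bytes and each Latin letter takes one, which makes it easy to tell the two scripts apart even when the glyphs look the same.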

Hi Irfan,
Adding case insensitivity for all languages/scripts is a big undertaking, which is why the functionality is limited to the ASCII character set for now. That is why the Cyrillic query above finds nothing. See docs here
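
A possible workaround, offered as a sketch and not as an official recommendation: wildcard fields also accept regexp queries, so the upper- and lower-case form of each letter can be spelled out in a character class. The helper names below (case_insensitive_regexp, build_query) are hypothetical, and the sketch assumes a simple one-to-one upper/lower mapping per character is good enough for your data.

```python
import json

# Characters reserved by Lucene regular expression syntax; they are
# escaped so that they match literally.
LUCENE_RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def case_insensitive_regexp(term: str) -> str:
    """Build a 'contains' regexp that matches term in any letter case."""
    parts = []
    for ch in term:
        lower, upper = ch.lower(), ch.upper()
        if lower != upper and len(lower) == 1 and len(upper) == 1:
            # Cased letter: accept either form, e.g. [Тт].
            parts.append(f"[{upper}{lower}]")
        elif ch in LUCENE_RESERVED:
            parts.append("\\" + ch)
        else:
            parts.append(ch)
    # regexp queries must match the whole value, so wrap in .* on both
    # sides to mimic the *term* wildcard pattern.
    return ".*" + "".join(parts) + ".*"


def build_query(field: str, term: str) -> dict:
    """Return a search body using a regexp query on the given field."""
    return {"query": {"regexp": {field: {"value": case_insensitive_regexp(term)}}}}


print(json.dumps(build_query("name", "Тест"), ensure_ascii=False, indent=2))
```

The printed body can be sent to POST /index/_search in place of the wildcard query. For the term Тест the generated pattern is .*[Тт][Ее][Сс][Тт].*, which should match the stored value ТЕСТ. Long terms produce long patterns, so it is worth checking query performance before relying on this.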

Hey @Mark_Harwood1, thanks for the response.
Do you know when Elastic is planning to expand support to other characters?

No, sorry
