Unexpected Behavior with ICU Collation Keyword Sorting

Hello,

I am experiencing unexpected behavior with the sorting order of documents in Elasticsearch using the icu_collation_keyword field type. Here are the details:

Steps to Reproduce:

  1. Create the Index with Mappings:
    PUT /test-index
    {
      "mappings": {
        "properties": {
          "id422": {
            "type": "text",
            "fields": {
              "collated": {
                "type": "icu_collation_keyword",
                "strength": "tertiary",
                "case_level": true
              }
            }
          }
        }
      }
    }

  2. Index the Documents:
    POST /test-index/_doc/1
    {
      "id422": "0a11"
    }

    POST /test-index/_doc/2
    {
      "id422": "0A11"
    }

    POST /test-index/_doc/3
    {
      "id422": "0b11"
    }

    POST /test-index/_doc/4
    {
      "id422": "0B11"
    }

    POST /test-index/_doc/5
    {
      "id422": "0c11"
    }

    POST /test-index/_doc/6
    {
      "id422": "0C11"
    }

  3. Search and Sort:
    GET /test-index/_search
    {
      "sort": [
        {
          "id422.collated": {
            "order": "asc"
          }
        }
      ],
      "_source": ["id422"]
    }

Expected Sort Order:

  1. 0A11
  2. 0B11
  3. 0C11
  4. 0a11
  5. 0b11
  6. 0c11

Actual Sort Order:

The documents are not returned in the case-sensitive order I expected (all uppercase values first, then all lowercase values). Instead, each lowercase value is immediately followed by its uppercase counterpart.

Response (sort order):

  1. 0a11
  2. 0A11
  3. 0b11
  4. 0B11
  5. 0c11
  6. 0C11
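
To make the difference between the two orders concrete, here is a small Python sketch that reproduces both of them without Elasticsearch or ICU. It is only an illustration: the key function used for `actual` is my own approximation of the behavior I observe, not how the plugin actually computes its sort keys.

```python
values = ["0a11", "0A11", "0b11", "0B11", "0c11", "0C11"]

# Expected: plain code point order, where 'A'-'Z' sort before 'a'-'z'.
expected = sorted(values)

# Observed: values compared ignoring case first; on a tie,
# the lowercase variant comes before the uppercase one.
# (Hypothetical approximation, not the ICU algorithm itself.)
actual = sorted(values, key=lambda s: (s.lower(), s != s.lower()))

print(expected)
print(actual)
```

The first list matches my expected order above and the second matches what the search returns.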

In addition, the sort values in the response are not the original strings but unreadable characters, for example:
"sort": [
"""কՅ‡ࡀ

Additional Information:

  • Elasticsearch version: 8.15.3
  • Kibana version: 8.15.3
  • ICU Analysis plugin version: 8.15.3

Any insights or suggestions on how to resolve this issue would be greatly appreciated.

Thank you!