How to perform natural sort on a field in Elasticsearch?

I have an Elasticsearch keyword field called LocationNumber that I want to sort "alphanumerically" (or "naturally") so that the results are easy for users to read. I have been researching this for days and have not found any real discussions or solutions.

What I want is to sort numbers and letters in a "natural" way.

This is the default way that Elasticsearch sorts in ascending order, and is NOT what I want:

0102, 1, 101, 101A, 101B, 2, B101, CC452PD452, Store102, Store102A

This is what I want when I sort in ascending order:

1, 2, 101, 0102, 101A, 101B, B101, CC452PD452, Store102, Store102A

I saw a natural sort plugin (here), but it appears to be too old, and it would not install on the latest version of Elasticsearch.

Is there a way to implement a "natural" sort either at index time or at query time? I would prefer an index-time solution, because I expect a query-time solution to consume too much memory.

I found an answer: the ICU Collation Keyword Field, documented in Elasticsearch Plugins and Integrations [7.15].

You have to install the "analysis-icu" plugin and add a sub-field of type icu_collation_keyword, with "numeric" set to true, to the field as a multi-field. You can then sort on that sub-field instead of on the keyword field itself.

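This relies on the Unicode Collation Algorithm. With "numeric" enabled, runs of digits are compared by their numeric value rather than character by character, so 2 sorts before 101.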
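
For anyone landing here later, this is a rough sketch of what the mapping and the sort request might look like. It only builds and prints the JSON bodies, so it can be run without a cluster. The index name "locations" and the sub-field name "natural" are my own placeholders, not anything from the docs, and "index": false is an optional choice on the assumption that the sub-field is used only for sorting. The analysis-icu plugin has to be installed on every node first (bin/elasticsearch-plugin install analysis-icu).

```python
import json

# Hypothetical names, chosen for illustration only.
INDEX_NAME = "locations"
SORT_SUBFIELD = "natural"

# Mapping body: LocationNumber stays a keyword field and gains a
# multi-field of type icu_collation_keyword with numeric collation on.
mapping = {
    "mappings": {
        "properties": {
            "LocationNumber": {
                "type": "keyword",
                "fields": {
                    SORT_SUBFIELD: {
                        "type": "icu_collation_keyword",
                        "numeric": True,
                        # Assumption: the sub-field is only sorted on,
                        # never searched, so it is not indexed.
                        "index": False,
                    }
                },
            }
        }
    }
}

# Search body: sort on the sub-field rather than on LocationNumber itself.
search_body = {
    "query": {"match_all": {}},
    "sort": [{f"LocationNumber.{SORT_SUBFIELD}": {"order": "asc"}}],
}

# Print the requests in the form they would be pasted into the Dev Tools console.
print(f"PUT /{INDEX_NAME}")
print(json.dumps(mapping, indent=2))
print(f"GET /{INDEX_NAME}/_search")
print(json.dumps(search_body, indent=2))
```

Because the collation key is computed at index time, existing documents need to be reindexed before the new sub-field can be sorted on. The exact order of mixed values such as 0102 and 101A depends on the collation rules, so it is worth checking the result against a small sample of real LocationNumber values.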
