Case-insensitive sort

Phil_Hagelberg_2 · May 26, 2010, 2:55am

I'm trying to get some fields to perform a case-insensitive sort.

{"index":
  {"analysis":
    {"analyzer":
      {"text":
        {"tokenizer": "standard",
         "filter": ["standard", "lowercase"]},
       "sortable":
        {"tokenizer": "keyword",
         "filter": ["lowercase"]}}}}}

If I set my field to {"index": "not_analyzed", "analyzer":
"sortable"}, then it sorts in a case-sensitive manner. But if I drop
the "index" setting and use the default ("analyzed"), then it
correctly sorts case-insensitively. This is bewildering to me because
all other lucene-based systems I've worked with have warned that if a
field is analyzed, it can't be used for sorting at all.

So how is it that Elastic Search is able to get around this
limitation? And why does it break when I set it to "not_analyzed"?

thanks,
Phil

Phil_Hagelberg_2 · May 26, 2010, 3:13am

So I think I might actually have figured it out--since I'm telling it
to use an analyzer which performs no tokenization, it's still able to
perform the sorting. But telling it "no analysis" also means "no
filtering", which means my lowercasing isn't applied. IOW it's not
analysis that interferes with sorting in the first place.

Is that correct?

-Phil
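
The reasoning in the post above can be illustrated with a small Python sketch. The three functions are rough stand-ins for the "text" analyzer, the "sortable" analyzer, and the not_analyzed case; they are assumptions made for illustration and not Lucene's actual implementations, and the sample values are made up.

```python
# Rough stand-ins for the two custom analyzers plus the not_analyzed case.
# These are illustrative approximations, not Lucene's real implementations.

def standard_lowercase(value):
    # "text" analyzer: split into words, then lowercase each token.
    return [token.lower() for token in value.split()]

def keyword_lowercase(value):
    # "sortable" analyzer: the keyword tokenizer emits the whole value
    # as one token, and the lowercase filter then lowercases it.
    return [value.lower()]

def not_analyzed(value):
    # No analyzer runs at all, so no filter runs either.
    return [value]

def sort_by_indexed_term(values, analyze):
    # Sorting is only well defined when each value yields exactly one term.
    for value in values:
        assert len(analyze(value)) == 1, f"{value!r} produced several terms"
    return sorted(values, key=lambda value: analyze(value)[0])

values = ["banana", "Apple", "cherry", "Date"]  # made-up sample data

print(sort_by_indexed_term(values, not_analyzed))       # ['Apple', 'Date', 'banana', 'cherry']
print(sort_by_indexed_term(values, keyword_lowercase))  # ['Apple', 'banana', 'cherry', 'Date']
print(standard_lowercase("Big Apple"))                  # ['big', 'apple']
```

The last line hints at where the usual warning comes from: a tokenizing analyzer can turn one value into several terms, and it is that, rather than analysis as such, that gets in the way of sorting.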

kimchy · May 26, 2010, 6:31am

Yea, when you set not_analyzed, it won't apply an analyzer to the field,
regardless of which one you configure on it. In this case, you can set the
field to analyzed and keep the sortable analyzer.
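
As a minimal sketch of that suggestion, the snippet below builds the index settings from the first post and a field mapping as Python dicts and prints the mapping as JSON. The type name "doc" and the field name "title" are hypothetical, and the "string" field type is an assumption based on the mapping syntax of the era of this thread; only the "index" and "analyzer" values come from the discussion.

```python
import json

# Index settings as posted at the top of the thread.
settings = {
    "index": {
        "analysis": {
            "analyzer": {
                "text": {
                    "tokenizer": "standard",
                    "filter": ["standard", "lowercase"],
                },
                "sortable": {
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                },
            }
        }
    }
}

# Hypothetical type ("doc") and field ("title") names.
# "index": "analyzed" is the default, so it could also be left out.
mapping = {
    "doc": {
        "properties": {
            "title": {
                "type": "string",
                "index": "analyzed",
                "analyzer": "sortable",
            }
        }
    }
}

# Sanity check: the analyzer the field refers to is actually defined.
field = mapping["doc"]["properties"]["title"]
defined = settings["index"]["analysis"]["analyzer"]
assert field["analyzer"] in defined

print(json.dumps(mapping, indent=2))
```

Because the "sortable" analyzer uses the keyword tokenizer, each value is still indexed as a single term, just lowercased, so sorting on the field keeps working.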

