Regexp and case insensitive

rore · May 26, 2015, 8:15am

As far as I know, there is no case-insensitive option for the regexp query or filter.

Is there a good reason for that?

This means that to provide case-insensitive regex searches you need a multi-field: one sub-field with the original term and one with the term lowercased.

This is quite an overhead when you have a lot of documents or fields. If case-insensitive regex search is something that doesn't happen often, it can be better to "pay" the CPU overhead at query time rather than always sustain the indexing overhead of keeping multiple indexed copies of the field.
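
For reference, the multi-field workaround described above could be sketched roughly as follows. This is only an illustration: the field name `title`, the type name `doc`, the sub-field name `lower`, and the analyzer name `lowercase_keyword` are all made up, and the mapping uses the `string` / `not_analyzed` syntax of Elasticsearch versions from around the time of this thread (other versions spell it differently). The functions only build the request bodies as Python dicts; sending them to a cluster is left out.

```python
import json


def build_index_body(field="title"):
    """Index settings and mapping with an exact sub-field layout (hypothetical names).

    The main field keeps the original term untouched; the "lower" sub-field
    indexes the same value as a single lowercased token.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # keyword tokenizer keeps the whole value as one token,
                    # the lowercase filter then folds its case
                    "lowercase_keyword": {
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["lowercase"],
                    }
                }
            }
        },
        "mappings": {
            "doc": {
                "properties": {
                    field: {
                        "type": "string",
                        "index": "not_analyzed",
                        "fields": {
                            "lower": {
                                "type": "string",
                                "analyzer": "lowercase_keyword",
                            }
                        },
                    }
                }
            }
        },
    }


def build_regexp_query(pattern, field="title"):
    """Regexp query against the lowercased sub-field.

    The pattern itself is lowercased on the client side, because the
    regexp query is not analyzed.
    """
    return {"query": {"regexp": {field + ".lower": pattern.lower()}}}


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2))
    print(json.dumps(build_regexp_query("Bla.*"), indent=2))
```

The cost mentioned above is visible here: every document stores the term twice, once per sub-field, whether or not a case-insensitive search ever runs.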

adi · November 17, 2015, 11:23am

Hi Rotem,

Did you find any solution other than a multi-field or something like [Bb][Ll][Aa][Bb][Ll][Aa]?
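
The character-class form mentioned above can at least be generated instead of typed by hand. A minimal sketch, assuming the input is a plain literal string rather than an arbitrary regex; the helper name `case_insensitive_pattern` is made up, and the set of characters it escapes is an assumption about which characters the Lucene regexp syntax treats as operators:

```python
# Characters assumed to be operators in Lucene's regexp syntax
# (including the optional ones); they are backslash-escaped below.
RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def case_insensitive_pattern(literal):
    """Expand a literal string into a pattern matching any letter case.

    Each cased letter becomes a two-character class such as [Bb];
    everything else is copied through, escaped if it is reserved.
    """
    parts = []
    for ch in literal:
        upper, lower = ch.upper(), ch.lower()
        # Only expand letters whose case variants are single characters
        if upper != lower and len(upper) == 1 and len(lower) == 1:
            parts.append("[" + upper + lower + "]")
        elif ch in RESERVED:
            parts.append("\\" + ch)
        else:
            parts.append(ch)
    return "".join(parts)


if __name__ == "__main__":
    print(case_insensitive_pattern("blabla"))
```

This avoids the second indexed copy of the field, but the resulting pattern is longer and the work moves to query time.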

nik9000 · November 17, 2015, 1:24pm

The trouble is that Lucene regexes don't have an option for case-insensitive matching. I cobbled together something that mostly works in wikimedia-extra's source_regex filter. It's by no means perfect or efficient, or even right in some cases. It also doesn't work like the regular regexp search, so it's not a stand-in for what you are doing. So I can't really suggest that you use it; it's more of a case study in why this is hard.
