Wildcard case-insensitive query string

Hello everyone,

I'm not sure whether this is the intended behavior, but I have an index with the following settings and mapping:

{
    "settings": {
        "analysis": {
                "analyzer": {
                    "myanalyzer": {
                            "type":      "custom",
                            "tokenizer": "whitespace",
                            "filter": [
                            "lowercase",
                            "asciifolding"
                            ]
                    }
                }
        }
    },
    "mappings":{
        "mytype":{

            "_all": {
                "type": "text",
                "index": "analyzed",
                "analyzer": "myanalyzer",
                "search_analyzer": "myanalyzer"
            },

            "properties":{
                "post": {
                    "type": "text",
                    "analyzer": "myanalyzer",
                    "term_vector": "with_positions_offsets",
                    "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                    }
                }
            },

            "dynamic_templates": [
                { "en": {
                        "match":              "*",
                        "match_mapping_type": "text",
                        "mapping": {
                            "type":           "text",
                            "analyzer":       "myanalyzer",
                            "search_analyzer": "myanalyzer"
                        }
                }}
            ]

        }
    }
}

I'm trying to search for the following:

*TLM

This does not return any results. However, the following does:

*tlm

Using any other case combination doesn't return any results either.

This is what I used for testing this scenario:

POST /myindex/_search
{
    "explain": true, 
    "query": {
       "bool":{
          "filter":{
             "range":{
                "addDate":{
                   "from":"2015-01-16T01:55:15+02:00",
                   "include_lower":true,
                   "include_upper":true,
                   "to":"2017-03-15T01:55:15+02:00"
                }
             }
          },
          "must":{
             "query_string":{
                "query":"*tlM",
                "allow_leading_wildcard": "true",
                "default_operator": "AND"
             }
          }
       }
    },
    "size": 1
}

Any help would be greatly appreciated!
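
For readers following along, here is a minimal Python sketch of one explanation that is consistent with the behavior described above. It assumes (this is not confirmed anywhere in the thread) that the wildcard term in the query is not passed through "myanalyzer", while the indexed tokens are lowercased. The sample text and the helper names `analyze` and `wildcard_hits` are hypothetical, and the asciifolding step is left out because it does not affect this example.

```python
import fnmatch


def analyze(text):
    """Rough stand-in for "myanalyzer": whitespace tokenizer + lowercase filter."""
    return [token.lower() for token in text.split()]


def wildcard_hits(pattern, indexed_tokens):
    """Case-sensitive wildcard match; the pattern itself is NOT analyzed (assumption)."""
    return [t for t in indexed_tokens if fnmatch.fnmatchcase(t, pattern)]


indexed = analyze("Report from XTLM team")  # hypothetical document text
print(wildcard_hits("*TLM", indexed))  # nothing: the index only holds lowercase tokens
print(wildcard_hits("*tlm", indexed))  # finds the lowercased token
```

If that assumption holds, any pattern containing an uppercase letter can never match, which would explain why only `*tlm` returns results.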

I'm having the same problem as you. The only difference is that I'm using the keyword type instead of text.
The only solution I've found so far is to add a new field containing the lowercased string, which I send to Elasticsearch along with the rest of the document data.
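
A rough sketch of that workaround on the client side, in Python. The function name `with_lowercase_copy`, the `_lower` suffix and the sample field are made up for illustration; they are not from my actual code.

```python
def with_lowercase_copy(doc, field, suffix="_lower"):
    """Return a copy of doc with an extra field holding the lowercased value of `field`."""
    enriched = dict(doc)
    enriched[field + suffix] = doc[field].lower()
    return enriched


# Hypothetical document: the extra "code_lower" field is what lowercase wildcards target.
doc = with_lowercase_copy({"code": "XTLM-42"}, "code")
print(doc)
```

The wildcard query then has to be lowercased as well and pointed at the extra field instead of the original one.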

Hello Fabio,

Thanks for the reply. I'll keep it in mind in case our current approach doesn't work out.

At the moment we lowercase all the text that goes into the query parser before it reaches ES. It seems to do the trick without having to add extra fields or modify the mapping in any way.

Scratch that. Lowercasing everything before sending it to ES breaks some use cases, such as searching for data in a specific field, because the field name gets lowercased along with the search term.

Does your approach work in those scenarios?

For example:

myField:*TLM
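
One way around this on the client side would be to lowercase only the term part of each `field:term` token and leave field names and boolean operators alone. The Python below is a minimal sketch of that idea, not a full query_string parser: the helper name `lowercase_terms` is hypothetical, and it deliberately ignores quoted phrases, escaped characters and values such as dates that contain colons.

```python
import re

# Uppercase keywords of the query_string syntax that must keep their case.
OPERATORS = {"AND", "OR", "NOT", "TO"}

# Optional leading +, -, ! or ( followed by a field name and a colon.
FIELD_PREFIX = re.compile(r"^([+\-!(]*[\w.]+:)(.*)$")


def lowercase_terms(query):
    """Lowercase search terms while preserving field names and operators."""
    out = []
    for token in query.split():
        if token in OPERATORS:
            out.append(token)
            continue
        match = FIELD_PREFIX.match(token)
        if match:
            # Keep "myField:" as typed, lowercase only what follows the colon.
            out.append(match.group(1) + match.group(2).lower())
        else:
            out.append(token.lower())
    return " ".join(out)


print(lowercase_terms("myField:*TLM"))
print(lowercase_terms("*TLM AND otherField:Foo*"))
```

This keeps `myField` intact while the wildcard term itself becomes `*tlm`.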

I've solved the problem using a normalizer.

Below is the mapping of my index, where I've added a "normalizer" to make queries case-insensitive. The same normalizer also folds accented characters, so accents are ignored when searching.

PUT test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "folding": {
          "tokenizer": "standard",
          "filter":  [ "lowercase", "asciifolding" ]
        }
      },
      "normalizer": {
        "lowerasciinormalizer": {
          "type": "custom",
          "filter":  [ "lowercase", "asciifolding" ]
        }
      }
    }
  },
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "string_as_keyword": {
            "match_mapping_type": "string",
            "match":   "*_k",
            "mapping": {
              "type": "keyword",
              "normalizer": "lowerasciinormalizer"                              
            }
          }
        }
      ]
    }
  }
}

PUT test/1/123
{
    "str_k" : "string âgáÈÒU is cool"
}

GET test/_search
{
  "query": {
    "wildcard": {
      "str_k": "*agaeou*"
    }
  }
}
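
To illustrate why the lowercase, unaccented pattern matches, here is a small Python approximation of what the "lowerasciinormalizer" does to the stored value. It uses Unicode decomposition to strip accents, which is only an approximation of the real asciifolding filter, and the helper name `normalize` is made up for this sketch.

```python
import fnmatch
import unicodedata


def normalize(value):
    """Approximate the "lowercase" + "asciifolding" filters on a whole keyword value."""
    decomposed = unicodedata.normalize("NFKD", value.lower())
    # Drop the combining accent marks left behind by the decomposition.
    return "".join(c for c in decomposed if not unicodedata.combining(c))


stored = normalize("string âgáÈÒU is cool")
print(stored)                                   # string agaeou is cool
print(fnmatch.fnmatchcase(stored, "*agaeou*"))  # True
```

Because a keyword field is indexed as a single term, the normalized value is the whole string, and the wildcard pattern has to be written in the same lowercase, unaccented form.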

Thank you so much. I had been stuck on case-insensitive sorting for quite a long time, so your example is very helpful.

Cheers
