Best practice of case insensitive keyword mapping in ES 5.x

Youxu · February 6, 2017, 7:10am

In my old ES 2.x index mapping, I had a custom analyzer to support case insensitive keyword search:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "type": {
      "properties": {
        "city": {
          "type": "string",
          "analyzer": "lowercase_keyword"
        }
      }
    }
  }
}

Now in ES 5.x, the "string" type has been replaced by "text" and "keyword", so I see two options for implementing a case-insensitive mapping:

1. Use the same lowercase_keyword analyzer approach as in ES 2.x, but change "string" to "text" in the field mapping.
2. Use the new "keyword" type with the new normalizer concept, as follows:

{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "char_filter": [],
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "type": {
      "properties": {
        "city": {
          "type": "keyword",
          "normalizer": "lowercase_normalizer"
        }
      }
    }
  }
}

Which one is better? Functionally, I don't think there is any difference. Is there a performance difference between the two approaches?

Mark_Harwood · February 6, 2017, 11:02am

There should be no difference in terms of search.

Aggregations are a different story, as the text field would have to use heap-based FieldData (disabled by default), whereas the keyword field would use disk-based DocValues (generally recommended for analytic use cases).
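
To illustrate the point about FieldData, here is a minimal sketch of what the text-based option would need before the city field could be used in aggregations: fielddata has to be enabled explicitly in the mapping. The field and analyzer names are taken from the question; it assumes lowercase_keyword is defined under settings.analysis.analyzer, and it is meant as an illustration rather than a recommendation.

```json
{
  "mappings": {
    "type": {
      "properties": {
        "city": {
          "type": "text",
          "analyzer": "lowercase_keyword",
          "fielddata": true
        }
      }
    }
  }
}
```

With this setting the field values are loaded onto the heap when first aggregated on, which is the cost the keyword option avoids.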

Youxu · February 7, 2017, 12:51am

So in ES 5.x there is no difference for either search or aggregation, since ES 5.x uses doc values for text fields by default, right?

Mark_Harwood · February 7, 2017, 9:26am

Untrue. DocValues are the default for keyword fields but are not supported for text fields.
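
As a rough sketch of what this means in practice: with the keyword-plus-normalizer mapping from the question, a terms aggregation on city should run against doc values without any extra mapping settings. The aggregation name "cities" is a hypothetical label chosen for this example.

```json
{
  "size": 0,
  "aggs": {
    "cities": {
      "terms": { "field": "city" }
    }
  }
}
```

Because the normalizer is applied before the value is indexed, the bucket keys would be expected to come back in their lowercased form rather than in the original casing.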

Youxu · February 8, 2017, 1:12am

Thanks for the clarification!

