Keyword datatype and analysis

knights · November 17, 2016, 9:54am

Hello, everyone

I am currently testing Elasticsearch 5.0. That version should, as far as I understand, solve an old problem of mine: allowing doc values to be used on keyword-analyzed fields.
This is great, because I have absolutely no power over the JSON pushed to Elasticsearch (actually, tampering with the source in any way, even changing the casing, would have legal repercussions).

The problem is, I run a lot of aggregations on fields whose values can be given to me in either lowercase or uppercase, so "index" : "no" is not a viable solution.

Enter the "keyword" datatype. In my template, I built an analyzer, key_lowercase:

"analysis" : {
  "analyzer" : {
    "key_lowercase" : {
      "tokenizer" : "keyword",
      "filter" : "lowercase"
    }
  }
}

And I try to use it like this:

"my_field" : {
  "type" : "keyword",
  "analyzer" : "key_lowercase"
}

In my understanding, this should work, the keyword data type having been created with that in mind.

However, when I try to push my template, I get this error:

{
  "type" : "mapper_parsing_exception",
  "reason" : "Failed to parse mapping [my_type]: Mapping definition for [my_field] has unsupported parameters: [analyzer : key_lowercase]"
}

What did I not understand?
How should I proceed to index those fields in lowercase?
Thanks a lot

dadoonet · November 17, 2016, 10:55am

This will be possible in 5.1. For now you have to work around it and use the ingest lowercase processor.
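
A minimal sketch of that workaround, assuming a pipeline named lowercase_keys (a hypothetical name) and the my_field field from the question. This would be the body of a PUT _ingest/pipeline/lowercase_keys request:

```json
{
  "description": "Lowercase my_field before indexing (illustrative sketch, names are assumptions)",
  "processors": [
    {
      "lowercase": {
        "field": "my_field"
      }
    }
  ]
}
```

Documents indexed with ?pipeline=lowercase_keys would then reach the keyword field already lowercased, so aggregations see a single casing. The processor rewrites the value in the document itself, so the stored _source changes as well.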

knights · November 17, 2016, 12:15pm

Thanks for the reply,

Ah, OK, my bad. I thought it was already possible. I'll wait for 5.1, then.

About the ingest lowercase processor: that could do the trick, if it weren't for two "problems":
Ingest seems to modify the source before indexing (if I get it right), and across all our mappings we have 250+ fields analysed with that key_lowercase. (That seems like a lot of work to get around an issue that will be solved in the near future.)
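
One possible variation, sketched under assumptions: copy the value into a separate field and lowercase only the copy, so that my_field itself stays exactly as received. The field name my_field_lower is hypothetical. The copy is still added to the stored source, so this may not satisfy a strict no-modification requirement, and it would need one such pair of processors per field.

```json
{
  "description": "Keep my_field as received, lowercase a copy (illustrative sketch, names are assumptions)",
  "processors": [
    {
      "set": {
        "field": "my_field_lower",
        "value": "{{my_field}}"
      }
    },
    {
      "lowercase": {
        "field": "my_field_lower"
      }
    }
  ]
}
```

Aggregations would then target my_field_lower, mapped as a keyword, instead of my_field.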

Anyway, I can wait a few months. Thanks again for the reply.

dadoonet · November 17, 2016, 1:42pm

I hope it will be released sooner than in months.

No precise date though.

knights · December 9, 2016, 2:08pm

Hello again !

I'm coming back to you regarding Elasticsearch 5.1.1, which has now been released.
Nowhere in the docs or release notes does it say that using "keyword-type" analysis together with doc_values on the keyword datatype is now possible.

Has it been postponed? Or outright cancelled?
I'd be thankful for any info.

dadoonet · December 9, 2016, 2:27pm

It is still open.

We thought it could be in 5.1 but it has been delayed.
PR pending here:

system · January 6, 2017, 2:27pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
