Keyword datatype and analysis


#1

Hello, everyone

I am currently testing Elasticsearch 5.0. That version should, as far as I understand, solve an old problem of mine: allowing doc values to be used on keyword-analyzed fields.
This is great, because I have absolutely no power over the JSON pushed to Elasticsearch (actually, tampering with the source in any way, even changing the casing, would have legal repercussions).

The problem is, I run a lot of aggregations on fields that may be given to me in either lowercase or uppercase, so "index" : "no" is not a viable solution.

Enter the "keyword" datatype. In my template, I built an analyser, key_lowercase:

"analysis" : {
  "analyzer" : {
    "key_lowercase" : {
      "tokenizer" : "keyword",
      "filter" : "lowercase"
    }
  },

And I try to use it like this:

"my_field" : {
  "type" : "keyword",
  "analyzer" : "key_lowercase"
},

In my understanding, this should work, the keyword data type having been created with that in mind.

However, when I try to push my template, I get this error:

{
  "type" : "mapper_parsing_exception",
  "reason" : "Failed to parse mapping [my_type]: Mapping definition for [my_field] has unsupported parameters : [analyzer : key_lowercase]"
}

What did I not understand?
How should I proceed to index those fields in lowercase?
Thanks a lot


(David Pilato) #2

This will be possible in 5.1. For now you have to work around it and use the ingest lowercase processor.
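
As a rough sketch of that workaround, a pipeline definition using the lowercase processor could look something like the following. The field name my_field is taken from the question; everything else here (the description text, and whatever name the pipeline is registered under) is only illustrative.

```json
{
  "description" : "Illustrative sketch only: lowercases my_field before the document is indexed",
  "processors" : [
    {
      "lowercase" : {
        "field" : "my_field"
      }
    }
  ]
}
```

A body like this would be registered with PUT _ingest/pipeline/<some_name> and applied by adding ?pipeline=<some_name> to the index request. Keep in mind that an ingest processor changes the field value in the document itself, not just the indexed terms.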


#3

Thanks for the reply,

Ah, OK, my bad... I thought it was already possible. I'll wait for 5.1, then.

About the ingest lowercase processor: that could do the trick, if it weren't for two "problems":
Ingest seems to modify the source before indexing (if I understand correctly), and across all our mappings we have 250+ fields analysed with that key_lowercase analyser. That seems like a lot of work to get around an issue that will be solved in the near future.

Anyway, I can wait a few months. Thanks again for the reply.


(David Pilato) #4

I hope it will be released sooner than in months. :slight_smile:

No precise date though.


#5

Hello again !

I am coming back to you now that Elasticsearch 5.1.1 has been released.
Nowhere in the docs or release notes does it say that using "keyword-type" analysis together with doc_values on the keyword datatype is now possible.

Has it been postponed? Or outright cancelled?
I'd be thankful for any information.


(David Pilato) #6

It is still open.

We thought it could be in 5.1 but it has been delayed.
PR pending here:

