Why does Elasticsearch still check the length of a keyword field, even if it's not indexed?


(esnewbie) #1

Hi all,
I'm new to Elasticsearch, using version 5.3.

As I understand it, the length of a keyword field is limited to about 32K because of Lucene's term byte-length limit of 32766.

But the thing I don't understand is this: when I set a field's type to keyword and index to false, and store is false by default, that means the field is neither indexed nor stored.
Yet when I save data larger than 32K into that field, I still get the error: DocValuesField "newField7" is too large, must be <= 32766.

So my question is: why does Elasticsearch still check the length of this field, even if I neither index nor store it?
Why not ignore it and just keep it in _source? Is there anything else Elasticsearch still has to do with this field, even if it's neither indexed nor stored?

I think Elasticsearch does not need to store this field's data as a Lucene term, right? Then why keep the limitation?
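
For readers following along, the setup described above can be sketched roughly as follows. This is only an illustration in Python: the type name `my_type` and the helper names are assumptions, while `newField7` and the 32766 limit come from the error message. Note that the limit counts UTF-8 bytes, not characters.

```python
import json

# Byte limit quoted in the error message.
MAX_BYTES = 32766

# Mapping as described in the post: keyword, not indexed, not stored.
# "my_type" is a hypothetical type name; "newField7" is from the error message.
mapping = {
    "mappings": {
        "my_type": {
            "properties": {
                "newField7": {"type": "keyword", "index": False, "store": False}
            }
        }
    }
}


def utf8_length(value: str) -> int:
    """Return the size of the value in UTF-8 bytes."""
    return len(value.encode("utf-8"))


def exceeds_limit(value: str) -> bool:
    """True if the value is larger than the 32766-byte limit."""
    return utf8_length(value) > MAX_BYTES


if __name__ == "__main__":
    print(json.dumps(mapping, indent=2))
    print(exceeds_limit("a" * 40000))
```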


(esnewbie) #2

Can anyone help? @elastic


(David Pilato) #3

Read this and specifically the "Also be patient" part.

It's fine to answer on your own thread after 2 or 3 days (not including weekends) if you don't have an answer.


(David Pilato) #4

Have a look at https://www.elastic.co/guide/en/elasticsearch/reference/current/enabled.html


(esnewbie) #5

@dadoonet thanks for your response, and sorry for my rushed reply. I will follow the code of conduct more carefully.

Back to my question.
I had already found and read the link you provided before I posted, and it does not answer my question.
As the link page says:

The enabled setting, which can be applied only to the mapping type and to object fields, causes Elasticsearch to skip parsing of the contents of the field entirely. The JSON can still be retrieved from the _source field, but it is not searchable or stored in any other way.

That does not solve my problem with a keyword field.
In fact, what I expected is that when I set a keyword field to "index": false, "store": false, Elasticsearch would treat it just like an object field with "enabled": false, which means just saving it to _source without checking whether the length is too large, because Elasticsearch no longer needs to store this field as a Lucene term.

Is there any reason that Elasticsearch still has to check the length of a keyword field with "index": false, "store": false?


(David Pilato) #6

The goal of the keyword data type is to be able to run aggs on it or sort by it.
In that sense, a keyword field is still meant to be usable. That's why the length is checked, IMO.
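
The word `DocValuesField` in the error suggests the check comes from doc values, the on-disk structure behind sorting and aggregations, which keyword fields keep enabled by default regardless of `index`. A minimal sketch, assuming that is the cause here, of how the mapping from the question differs from one that keeps the value only in `_source` (the helper name is hypothetical, and the behaviour on 5.3 is untested):

```python
def uses_doc_values(field: dict) -> bool:
    """Model the documented default: keyword fields write doc values
    unless the mapping explicitly sets "doc_values" to false."""
    return field.get("doc_values", True)


# Field mapping from the question: doc values are still on by default.
original = {"type": "keyword", "index": False, "store": False}

# Same field with doc values switched off as well, so the value
# should remain only in _source.
source_only = {
    "type": "keyword",
    "index": False,
    "store": False,
    "doc_values": False,
}

if __name__ == "__main__":
    print(uses_doc_values(original))     # doc values still written
    print(uses_doc_values(source_only))  # doc values disabled
```

`ignore_above` is another documented keyword option worth a look: values longer than the given number of characters are skipped rather than rejected, while the document's `_source` is kept as sent.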


(esnewbie) #7

OK, everyone sees it differently. I still think the length should not be checked in this scenario.
Maybe I should just use an object field instead of a keyword field in my situation.
Thanks for your answer!
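
A rough sketch of that object-field alternative, for illustration only. The type name `my_type` and the inner key `value` are hypothetical, and I have not verified whether 5.3 accepts a bare string for a disabled object field, so the text is wrapped in an object here to be safe:

```python
# Mapping with an object field whose contents are not parsed at all.
# "my_type" is a hypothetical type name.
mapping = {
    "mappings": {
        "my_type": {
            "properties": {
                "newField7": {"type": "object", "enabled": False}
            }
        }
    }
}


def wrap_large_text(text: str) -> dict:
    """Build a document that carries the text inside an object,
    so it is only kept in _source. The key "value" is hypothetical."""
    return {"newField7": {"value": text}}


if __name__ == "__main__":
    doc = wrap_large_text("a" * 40000)
    print(len(doc["newField7"]["value"]))
```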

