How can I verify that doc_values is actually working for a field?


#1

Is there any way to check whether doc_values is enabled for a field? I know that to enable doc_values for a field, you just set doc_values:true in the mapping.
But I'm not sure whether it actually takes effect.

According to this document
https://www.elastic.co/guide/en/elasticsearch/guide/1.x/doc-values.html#_enabling_doc_values

Doc values can be enabled for numeric, date, Boolean, binary, and geo-point fields, and for not_analyzed string fields.

Does this mean that if I have a field with a mapping like

  "clientMac":{
     "type":"string",
     "analyzer":"keyword",
     "doc_values":true
  }

Since it is not "not_analyzed", will doc_values be enabled for this field?

When does Elasticsearch decide whether doc_values should be enabled for a field (at the source code level)?

Your input is highly appreciated.


(Mark Walkom) #2

Only because you also have "doc_values":true, as in 1.X doc values are not enabled by default.


#3

OK, I see, but my question is: how do I check that it is enabled and works correctly?

In my example, does it really work on a field that sets the analyzer to "keyword" or to any custom analyzer?

Will Elasticsearch ignore fields that doc values cannot be applied to, and where can I see the log messages for fields it has ignored?

Thanks for your advice.


(Mark Walkom) #4

If it's mapped it's working. You can check with _cat/fielddata as well.
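
One way to confirm what the mapping holds is to fetch it back with the get-mapping API and look for the flag. Below is a minimal sketch, assuming the 1.x response layout (index, then "mappings", then type, then "properties") and assuming the flag is echoed back when it was set explicitly. The helper name `fields_with_doc_values` and the index and type names in the sample are made up for illustration; the sample is not real output.

```python
def fields_with_doc_values(mapping_response):
    """Return sorted 'index/type/field' paths whose mapping has doc_values: true.

    Assumes the shape returned by GET /<index>/_mapping in Elasticsearch 1.x.
    Only top-level properties are inspected; nested objects are not walked.
    """
    found = []
    for index_name, index_body in mapping_response.items():
        for type_name, type_body in index_body.get("mappings", {}).items():
            for field_name, field in type_body.get("properties", {}).items():
                if field.get("doc_values") is True:
                    found.append("%s/%s/%s" % (index_name, type_name, field_name))
    return sorted(found)


# Illustrative response only; "logs" and "event" are hypothetical names.
sample = {
    "logs": {
        "mappings": {
            "event": {
                "properties": {
                    "clientMac": {
                        "type": "string",
                        "index": "not_analyzed",
                        "doc_values": True,
                    },
                    "message": {"type": "string"},
                }
            }
        }
    }
}

print(fields_with_doc_values(sample))
```

In practice the dictionary would come from parsing the JSON body of the get-mapping response rather than from a literal.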

Keyword for 5.X, and only if the custom analyser includes not_analyzed.

It doesn't ignore them, it just does what is expected and doesn't create doc values for them. It is not logged because it's expected behaviour.


#5

Thank you.
In your post, you said "only if the custom analyser includes not_analyzed."
Do you mean that a custom analyzer like the following won't work?
==Field==
  "clientMac":{
     "type":"string",
     "analyzer":"lower_case_keyword",
     "doc_values":true
  }

==Custom Analyzer==

     "lower_case_keyword":{
        "type":"custom",
        "tokenizer":"keyword",
        "filter": [
           "lowercase"
        ]
     }

Will doc_values be enabled for my field "clientMac" or not?

By the way, I see a document here
https://www.elastic.co/guide/en/elasticsearch/reference/1.7/analysis-keyword-analyzer.html

It seems v1.7 supports this keyword analyzer. If I change the field mapping above to

  "clientMac":{
     "type":"string",
     "analyzer":"keyword",
     "doc_values":true
  }

will that make any difference?


#6

OK, I did a quick test. My Elasticsearch version is 1.7.2.

==field mapping==
  "clientMac":{
     "type":"string",
     "analyzer":"keyword",
     "doc_values":true
  }

This actually causes an exception, so the mapping does not work:
==>org.elasticsearch.index.mapper.MapperParsingException: Field [clientMac] cannot be analyzed and have doc values

So my conclusion is that you will not get doc_values enabled unless you explicitly set doc_values:true and the field is either a string field with index:"not_analyzed" or a numeric / boolean / date field.

This is, of course, already documented here:

https://www.elastic.co/guide/en/elasticsearch/guide/1.x/doc-values.html#_enabling_doc_values
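
The rule above can be sketched as a small check. This is a simplified model of the behaviour observed in this thread on 1.7.2, not Elasticsearch's actual code: `check_doc_values` is a hypothetical name, only the string / not_analyzed case is modelled, and the error text is copied from the exception quoted above.

```python
def check_doc_values(name, field):
    """Return True if doc values are requested and allowed for this field.

    Raises ValueError when a string field asks for doc values while still
    being analyzed, mirroring the MapperParsingException seen on 1.7.2.
    """
    wants_doc_values = field.get("doc_values") is True
    is_string = field.get("type") == "string"
    # Assumption: a 1.x string field counts as analyzed unless "index" is set
    # to something else; naming an analyzer (even "keyword") does not change that.
    analyzed = is_string and field.get("index", "analyzed") == "analyzed"
    if wants_doc_values and analyzed:
        raise ValueError("Field [%s] cannot be analyzed and have doc values" % name)
    return wants_doc_values


# The mapping from the test above, which was rejected.
rejected = {"type": "string", "analyzer": "keyword", "doc_values": True}
# The same field with index set to not_analyzed instead of an analyzer.
accepted = {"type": "string", "index": "not_analyzed", "doc_values": True}

try:
    check_doc_values("clientMac", rejected)
except ValueError as err:
    print(err)

print(check_doc_values("clientMac", accepted))
```

The second mapping is the form the documentation linked above describes for string fields: index set to "not_analyzed" together with doc_values:true.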

