Field can still be queried with _source and doc_values deactivated


#1

Hello everyone,

I am new to Elasticsearch and found a behaviour that I don't understand.

If I create an index template with the type setting "_source": { "enabled": false }
and the field mapping "doc_values": false, the field with the deactivated doc_values can still be queried.
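
For illustration, a template of the kind described might look like the sketch below, written as a Python dict so it can be serialized to JSON. The index pattern "logs-*", the type name "logs" and the integer field "bytes" are assumptions made for the example, and the layout assumes the older template format that still has a mapping type; it has not been checked against a specific Elasticsearch version.

```python
import json

# Hypothetical template body. The pattern "logs-*", the type name "logs"
# and the field name "bytes" are illustrative assumptions.
template_body = {
    "template": "logs-*",                    # index name pattern (assumed)
    "mappings": {
        "logs": {                            # mapping type (older layout, assumed)
            "_source": {"enabled": False},   # do not keep the original JSON
            "properties": {
                "bytes": {
                    "type": "integer",
                    "doc_values": False,     # no column-oriented copy of the values
                }
            },
        }
    },
}

# Serialize to the JSON that would be sent as the request body.
print(json.dumps(template_body, indent=2))
```

Note that nothing in this body turns off indexing of the field itself.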

As I understand it, the field's data is saved in the data structures _source and doc_values.

Is there another data structure where the information is saved, too?

Thank you for your help =)


(David Pilato) #2

So you want the field not to be indexed, right?

Set index: no. See https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-index.html
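
As a rough sketch, a field mapping with that setting could be built like this. The field name "bytes" is only an example, and the string value "no" assumes the older mapping syntax; newer versions are understood to use a boolean ("index": false) instead, so check the linked page for the version in use.

```python
import json

# Hypothetical field mapping; "bytes" is an illustrative field name.
# The value "no" assumes the older mapping syntax.
field_mapping = {
    "bytes": {
        "type": "integer",
        "index": "no",   # the field is not indexed, so it cannot be searched
    }
}

print(json.dumps(field_mapping, indent=2))
```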


#3

Thanks for the fast reply. But sadly, that is not what I am looking for.

I am studying Elasticsearch for my Bachelor thesis and want to look inside the architecture.
I want to understand why I can search on this field even though _source and doc_values are deactivated.

Maybe something is cached, or there is another data structure in the Lucene index deep in the architecture?


(David Pilato) #4

Yes. You can search on a field when the field is indexed, which is the default behavior.


#5

Do you mean indexed in the inverted index?

I thought only strings are stored in the inverted index, and numbers (integers, longs) are not stored in this data structure.

Am I wrong?


(David Pilato) #6

I'm not sure what you mean. Maybe others can comment.

But basically, if you need to:

  • search, you have to index the field (whatever its type)
  • compute aggregations, you need to have doc values
  • get the original field content (either store the field or get the content from the default _source field).
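
The three requirements above can be sketched with a toy model. This is only a simplified illustration and not how Lucene or Elasticsearch are implemented; the class `ToyShard` and its methods are hypothetical names invented for the example.

```python
from collections import defaultdict


class ToyShard:
    """Toy model of one shard. Hypothetical; not Lucene's real design."""

    def __init__(self, keep_source=True, keep_doc_values=True):
        self.keep_source = keep_source
        self.keep_doc_values = keep_doc_values
        self.inverted = defaultdict(set)      # (field, value) -> doc ids
        self.doc_values = defaultdict(dict)   # field -> {doc id: value}
        self.source = {}                      # doc id -> original document

    def index(self, doc_id, doc):
        for field, value in doc.items():
            # Every field is indexed in this toy model (the default behavior).
            self.inverted[(field, value)].add(doc_id)
            if self.keep_doc_values:
                self.doc_values[field][doc_id] = value
        if self.keep_source:
            self.source[doc_id] = dict(doc)

    def search(self, field, value):
        # Searching needs only the index.
        return sorted(self.inverted.get((field, value), set()))

    def total(self, field):
        # An aggregation needs the column of values.
        if field not in self.doc_values:
            raise LookupError("no doc values for field " + field)
        return sum(self.doc_values[field].values())

    def fetch(self, doc_id):
        # Returning the original content needs the source.
        if doc_id not in self.source:
            raise LookupError("no _source for document %s" % doc_id)
        return self.source[doc_id]


# Source and doc values are switched off; the field is still indexed.
shard = ToyShard(keep_source=False, keep_doc_values=False)
shard.index(1, {"bytes": 512})
shard.index(2, {"bytes": 1024})
shard.index(3, {"bytes": 512})

print(shard.search("bytes", 512))   # prints [1, 3]
```

In this sketch, `shard.total("bytes")` and `shard.fetch(1)` both raise `LookupError`, because the structures they depend on were never filled, while the search still returns the matching document ids.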

#7

Okay,

points 2 and 3 are absolutely clear to me.

As I understand it, one shard (a Lucene index) holds three data structures:

  • the inverted index (the terms of a string, with their positions and the documents that contain each term)
  • the _source (the original JSON form of the document)
  • doc_values (the original field values of a document, stored in a column-oriented fashion)

And I thought that the inverted index is only for strings and not for the other types. Every example on the internet uses strings, never an integer or another type.

See: https://www.elastic.co/guide/en/elasticsearch/guide/current/inverted-index.html

And the precise question is now: when I deactivate _source and doc_values for an integer field (for instance the field "bytes"), is it also stored in the inverted index?

If this is true, it makes sense that I can filter documents by a specific number of "bytes" without _source and doc_values.


(David Pilato) #8

Yes. To be exact, it is not stored but indexed in the inverted index.


#9

Ahh okay =) Thanks a lot! :wink:

That's not stated precisely in the Elastic documentation.

Thanks and have a nice day! =)

