How does Elasticsearch index non-text fields?

ImranArif · August 27, 2022, 10:46pm

Hi,

I understand that Elasticsearch analyzes text fields and saves the resulting tokens in an inverted index. But I am a bit confused about how Elasticsearch stores other fields, such as integer, float, and keyword. Does it treat each value of these fields as a single token (without breaking it down further) and store it in an inverted index as is? Or does it store those values in a separate data structure?

Thanks.

stephenb · August 28, 2022, 1:24am

Perhaps take a read through this.

keyword, numeric, date, etc. fields are not tokenized. Only text fields are tokenized.

Keywords are stored in the inverted index and in doc_values as well.

Numerics are indexed as well, but in a structure designed to support range searches etc. (in current versions, a BKD tree rather than the plain inverted index).
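
As a rough illustration of the structures mentioned above, here is a toy model in Python. It is only a sketch under simplifying assumptions, not how Lucene implements any of this: the documents and field names (`status`, `price`) are made up, and a sorted list with binary search stands in for the real numeric index structure.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Hypothetical documents; the field names are invented for illustration.
docs = {
    0: {"status": "active", "price": 30},
    1: {"status": "closed", "price": 10},
    2: {"status": "active", "price": 20},
}

# Inverted index for the keyword field: the whole value is one term,
# and each term maps to the sorted list of doc IDs that contain it.
inverted = defaultdict(list)
for doc_id in sorted(docs):
    inverted[docs[doc_id]["status"]].append(doc_id)

# doc_values: a per-field column (doc ID -> value), handy for sorting
# and aggregations because values can be read document by document.
price_doc_values = [docs[doc_id]["price"] for doc_id in sorted(docs)]

# Stand-in for the numeric index: (value, doc ID) pairs kept sorted,
# so every range of values is a contiguous slice.
price_points = sorted((d["price"], doc_id) for doc_id, d in docs.items())


def term_query(value):
    """Doc IDs whose status equals value exactly."""
    return inverted.get(value, [])


def range_query(low, high):
    """Doc IDs with low <= price <= high."""
    start = bisect_left(price_points, (low, -1))
    end = bisect_right(price_points, (high, float("inf")))
    return sorted(doc_id for _, doc_id in price_points[start:end])


def sort_by_price():
    """Doc IDs ordered by price, read from the doc_values column."""
    return sorted(range(len(price_doc_values)), key=price_doc_values.__getitem__)


print(term_query("active"))   # [0, 2]
print(range_query(15, 30))    # [0, 2]
print(sort_by_price())        # [1, 2, 0]
```

The point of the sketch is the access pattern: exact-match lookups go term to documents, range lookups rely on values being ordered, and sorting or aggregating goes document to value.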

ImranArif · August 28, 2022, 3:40am

Thanks @stephenb

So this means that numeric and keyword types both have doc_values enabled by default, to support efficient aggregations, sorting, and value lookups on those fields. With numeric types, additional metadata is also stored, specifically to make range queries more efficient.
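
For reference, a minimal sketch of a mapping that spells these defaults out, written as a Python dict that could be serialized as the body of a create-index request. The field names are hypothetical, and `index` and `doc_values` are set explicitly only to make the defaults visible; they can normally be omitted.

```python
import json

# Hypothetical mapping; field names are invented for illustration.
mapping = {
    "mappings": {
        "properties": {
            # Analyzed into tokens; text fields do not support doc_values.
            "description": {"type": "text"},
            # Not tokenized: the whole value is indexed as a single term.
            "status": {"type": "keyword", "index": True, "doc_values": True},
            # Numeric: indexed for range queries, doc_values for sorting
            # and aggregations.
            "price": {"type": "integer", "index": True, "doc_values": True},
        }
    }
}

# JSON body as it would be sent when creating the index.
print(json.dumps(mapping, indent=2))
```

Setting `doc_values` to false on a field would save disk space at the cost of sorting and aggregating on it.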

Please let me know if my understanding is correct.

stephenb · August 28, 2022, 4:07am

Yes, that's generally correct. Of course, there's a lot of low-level detail.

If there's something specific you're trying to solve, perhaps open a new thread describing that issue.

ImranArif · August 28, 2022, 7:32am

Thanks much @stephenb
I am not solving any problem at the moment. I am just trying to understand the ELK stack.

