Elasticsearch indexing storage mechanism

pgnrk23 · November 2, 2015, 7:36am

Hello Buddies,

I have a question about how Elasticsearch stores indexed data.
When we index data, the documents take up some space on disk. When we apply an analyzer, the space used changes, and the index grows to several times the size of the original data in the database.

Can anyone explain the ratio, or the concept behind it?

Thanks in advance.

warkolm · November 2, 2015, 7:48am

We store the original document in _source, and then index every field according to its analyser. So the size really depends on what sort of analysis you are using.

Note that we do compress things, so that should help.
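As a rough illustration of why the analysis chain drives index size, here is a minimal Python sketch that counts the terms different kinds of analysis would emit for one short value. It does not call Elasticsearch; `ngrams` and `shingles` are hypothetical helpers that only imitate the idea, using gram sizes of 1 to 2 and shingles of 2 words with the single words kept as well (my understanding of the usual defaults, so treat those as assumptions). Stemming such as Snowball changes terms rather than adding them, so it is left out.

```python
def ngrams(word, min_n=1, max_n=2):
    """Character n-grams of one word (hypothetical helper, not the ES API)."""
    return [word[i:i + n]
            for n in range(min_n, max_n + 1)
            for i in range(len(word) - n + 1)]


def shingles(words, size=2):
    """Groups of `size` adjacent words (hypothetical helper, not the ES API)."""
    return [" ".join(words[i:i + size])
            for i in range(len(words) - size + 1)]


text = "quick brown fox"
words = text.split()

counts = {
    # A raw (not analysed) field keeps the whole value as one term.
    "raw": len([text]),
    # Plain word tokenisation: one term per word.
    "words": len(words),
    # Shingles, assuming the single words are emitted alongside them.
    "shingle": len(words + shingles(words)),
    # n-grams applied to each word separately, as a token filter would.
    "ngram": len([g for w in words for g in ngrams(w)]),
}

for name, count in counts.items():
    print(f"{name:8s} {count} terms")
```

For this three-word value the n-gram variant emits 23 terms against 3 for plain words, which is the kind of multiplication that shows up on disk when several such sub-fields are indexed for the same source field.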

pgnrk23 · November 2, 2015, 8:03am

Thanks, Mark Walkom (@warkolm).
I am using the following analyzers:

Shingle
Snowball
nGram
Raw

Do you have any idea of the ratio between the original data size and the index size?
Also, do I need to change any settings to enable the compression you mentioned?
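One way to get at the ratio is to measure it rather than predict it: index a representative sample, read the store size from the index stats API (`GET /<index>/_stats/store`), and divide by the raw size of the same data. The sketch below assumes the response has already been parsed into a dict; `expansion_ratio`, the index name `my_index`, and every number in `sample_stats` are hypothetical and only there to make the example runnable. The last part shows the `index.codec` setting with the value `best_compression`, which, as far as I know, is available from Elasticsearch 2.0 and is set when the index is created; stored fields are compressed by default without it.

```python
import json


def expansion_ratio(stats, index_name, raw_bytes):
    """On-disk size of the primary shards divided by the raw data size."""
    store = stats["indices"][index_name]["primaries"]["store"]
    return store["size_in_bytes"] / raw_bytes


# Trimmed, made-up example of a parsed _stats/store response.
sample_stats = {
    "indices": {
        "my_index": {
            "primaries": {"store": {"size_in_bytes": 3500000}}
        }
    }
}

# Raw size of the same data outside Elasticsearch (made-up figure).
ratio = expansion_ratio(sample_stats, "my_index", raw_bytes=1000000)
print(f"index is {ratio:.1f}x the raw data")

# Request body for creating an index with the stronger (slower) codec.
best_compression_body = {
    "settings": {"index": {"codec": "best_compression"}}
}
print(json.dumps(best_compression_body, indent=2))
```

Comparing the ratio for the same sample with and without each analyzer gives a concrete per-analyzer cost for your own data.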
