Elasticsearch storage usage

Alexis_Okuwa · December 26, 2012, 8:33am

I am working on an application where I am using Elasticsearch as the
primary database. I have been running some inserts on an index and am seeing
storage grow to 3x the size of the raw files. I am not sure if this is
normal for a full-text search system. Is there anything I can do to make
some things less searchable, or otherwise keep the data size smaller?

--

Artem_Grinblat · December 26, 2012, 12:37pm

Lucene indexes are a form of append-only database: new data is written
to a new file, and once certain conditions are met the index is
compacted by merging several files into yet another file, skipping the
removed entries along the way. That means old versions of the data keep
taking up space until compaction. I think you can tune ES to run the Lucene
compactions (segment merges) more often.
There's also index compression, cf.
https://groups.google.com/d/msg/elasticsearch/3hNu6GPd4pE/I0Kcm_inyHQJ
And of course the inverted indexes have some overhead of their own, as they
need to store document ids and statistics for every term.
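
To make the append-only behaviour concrete, here is a toy sketch in Python. It is not how Lucene is implemented; the `ToyIndex` class, its method names, and the byte sizes are all made up for illustration. It only shows why disk usage can exceed the live data until a merge runs.

```python
class ToyIndex:
    """Toy model of an append-only index (hypothetical, not Lucene's API)."""

    def __init__(self):
        self.segments = []  # each segment: dict of doc_id -> size in bytes
        self.live = {}      # doc_id -> segment holding its current version

    def flush(self, docs):
        """Write a batch as a brand-new segment; old segments are never edited."""
        segment = dict(docs)
        self.segments.append(segment)
        for doc_id in segment:
            self.live[doc_id] = segment  # older copies become dead weight

    def disk_bytes(self):
        """Everything on disk, including superseded versions."""
        return sum(sum(seg.values()) for seg in self.segments)

    def live_bytes(self):
        """Only the current version of each document."""
        return sum(seg[doc_id] for doc_id, seg in self.live.items())

    def merge(self):
        """Rewrite all segments into one, skipping superseded entries."""
        merged = {doc_id: seg[doc_id] for doc_id, seg in self.live.items()}
        self.segments = [merged]
        self.live = {doc_id: merged for doc_id in merged}


index = ToyIndex()
index.flush({"a": 100, "b": 100})
index.flush({"a": 120})           # update "a": the old 100 bytes stay on disk
index.flush({"b": 90, "c": 50})   # update "b", add "c"

before = index.disk_bytes()
index.merge()
after = index.disk_bytes()
print(before, after)  # prints: 460 260
```

In this toy run the disk footprint is 460 bytes for 260 bytes of live data until the merge rewrites the segments.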

Just my two cents.

--

Artem_Grinblat · December 26, 2012, 12:41pm

AFAIK, Lucene has one of the most compact index formats. Other full-text
search engines usually have bigger indexes. See, for example,
http://taschenorakel.de/mathias/2012/04/18/fulltext-search-benchmarks/

Re: "I am not sure if this is normal for a full-text search system"

--

Alexis_Okuwa · December 26, 2012, 1:18pm

Okay, that makes sense. I am seeing the document size ratio going down; it is
now much closer to using twice as much space, and not more than that.

--

otisg · December 27, 2012, 4:07pm

Hi,

You probably also have _source enabled, may have individual fields
marked as stored, and maybe you also have _all enabled?
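
As a rough sketch of what checking those settings could look like, the Python snippet below builds a mapping body with _all disabled and one field left unindexed. The type name `doc` and the field names `title` and `raw_payload` are hypothetical, and the syntax (`string` type, `"yes"`/`"no"` values) is assumed from Elasticsearch releases of this thread's era; later releases changed or removed several of these settings, so check the documentation for your version.

```python
import json

# Hypothetical type and field names; adjust to your own schema.
mapping = {
    "doc": {
        # _all copies every field into one extra searchable field.
        "_all": {"enabled": False},
        # _source keeps the original JSON. Disabling it saves space, but the
        # original document can then no longer be fetched, so it stays on here
        # because the index is being used as the primary database.
        "_source": {"enabled": True},
        "properties": {
            # Searchable field; not stored separately since _source has it.
            "title": {"type": "string", "index": "analyzed", "store": "no"},
            # Not searchable at all: no inverted index is built for it.
            "raw_payload": {"type": "string", "index": "no", "store": "no"},
        },
    }
}

body = json.dumps(mapping, indent=2)
print(body)
```

The resulting JSON would be supplied as the mapping when the index is created. Fields that never need to be searched are the ones worth marking as not indexed.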

Otis

--
