Correct Sizing and compression on Elasticsearch

xziten · November 13, 2020, 1:45pm

Hi,

I am trying to work out some sizing ratios for an Elasticsearch cluster. This is my starting point:

**For 100 GB of raw data / day:**

* 110 GB of indexed data (~ +10% overhead for data types)
* 220 GB with 1 replica
* 6600 GB per month (30 days)
* +15% disk space to avoid saturation (7590 GB)
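
The arithmetic above can be sketched as a small calculation. This is only an illustration of the numbers in the post: the function name and parameter names are made up for this example, and the 10% overhead, 30-day month and 15% headroom are the assumptions stated in the list, not figures from Elastic.

```python
def estimate_storage_gb(raw_gb_per_day, index_overhead=0.10, replicas=1,
                        retention_days=30, headroom=0.15):
    """Rough disk estimate following the steps listed in the post.

    All defaults are the poster's assumptions, not official ratios.
    """
    indexed = raw_gb_per_day * (1 + index_overhead)   # raw data + indexing overhead
    with_replicas = indexed * (1 + replicas)          # primary + replica copies
    retained = with_replicas * retention_days         # data kept for the period
    total = retained * (1 + headroom)                 # free-space margin on top
    return {
        "indexed_per_day": round(indexed, 1),
        "with_replicas_per_day": round(with_replicas, 1),
        "retained": round(retained, 1),
        "with_headroom": round(total, 1),
    }


print(estimate_storage_gb(100))
```

Changing `index_overhead` to a negative value is a simple way to see what a given compression gain would do to the total.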

**Disk space with Hot/Warm architecture:**

* Hot Data Node : Disk/RAM Ratio 30:1
* Warm Data Node : Disk/RAM Ratio 100:1
* 3 master nodes with limited sizing
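
To turn the disk/RAM ratios above into node counts, something like the following could be used. It is a sketch under assumptions that are not in the post: the 64 GB of RAM per node and the disk volumes in the example calls are hypothetical, and the helper names are invented for illustration.

```python
import math


def ram_needed_gb(disk_gb, disk_to_ram_ratio):
    """RAM implied by a disk:RAM ratio (e.g. 30 for hot, 100 for warm)."""
    return disk_gb / disk_to_ram_ratio


def nodes_needed(disk_gb, disk_to_ram_ratio, ram_per_node_gb):
    """Number of data nodes needed to provide that much RAM.

    ram_per_node_gb is a hypothetical hardware choice, not from the post.
    """
    return math.ceil(ram_needed_gb(disk_gb, disk_to_ram_ratio) / ram_per_node_gb)


# Hypothetical example: 3000 GB on the hot tier, 6400 GB on the warm tier,
# with 64 GB of RAM per node.
print(nodes_needed(3000, 30, 64))    # hot tier at 30:1
print(nodes_needed(6400, 100, 64))   # warm tier at 100:1
```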

I think I have got this wrong, though: that is a lot of GB for a database. Does Elasticsearch apply any compression? Do you have some ratios you could give me?

Thanks

abrx · November 13, 2020, 2:53pm

Hi,
Not answering the question, but don't forget to force merge read-only indices (down to 1 segment, with flush enabled). This saves some space, because merging actually removes deleted documents, and improves query time, because there are fewer Lucene segments to search.
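
As a rough sketch of the force merge call described above, the snippet below only builds the request line and does not contact a cluster. `max_num_segments` and `flush` are parameters of the `_forcemerge` API; the index name and the helper function are hypothetical.

```python
from urllib.parse import urlencode


def forcemerge_request(index, max_num_segments=1, flush=True):
    """Build the method and path for a force merge of one index.

    The index is expected to be read-only (no longer written to).
    """
    params = urlencode({
        "max_num_segments": max_num_segments,   # merge down to this many segments
        "flush": "true" if flush else "false",  # flush the index after the merge
    })
    return "POST", f"/{index}/_forcemerge?{params}"


# "logs-2020.11.12" is a made-up index name for illustration.
method, path = forcemerge_request("logs-2020.11.12")
print(method, path)
```

The resulting line can be pasted into Kibana Dev Tools or sent with any HTTP client.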

xziten · November 13, 2020, 8:16pm

Hi, thank you for your answer, I will take note of it.

BenB196 · November 13, 2020, 8:30pm

A few things that you also might want to look into:

Index setting `codec`: you can set it to `best_compression`, which compresses the `_source` more and saves additional space. https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules.html
For historical data, you can look into searchable snapshots, which were added in 7.10. This allows you to offload the replica to an S3 bucket; if the primary shard goes down, Elasticsearch will automatically start restoring the snapshot to an available node. This saves you from having to keep replicas for older data. https://www.elastic.co/blog/introducing-elasticsearch-searchable-snapshots
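
For the codec suggestion, a minimal sketch of the index creation request might look like this. It only builds the request and prints it; the index name and the helper function are hypothetical. `index.codec` is a static setting, so it is set when the index is created (or on a closed index), and it applies to segments written after the change.

```python
import json


def create_index_request(index, codec="best_compression"):
    """Build method, path and JSON body for creating an index with a codec."""
    body = {"settings": {"index": {"codec": codec}}}
    return "PUT", f"/{index}", json.dumps(body, indent=2)


# "logs-2020.11" is a made-up index name for illustration.
method, path, body = create_index_request("logs-2020.11")
print(method, path)
print(body)
```

For daily or monthly indices, the same `settings` block would usually go into an index template rather than being sent for each index by hand.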

Christian_Dahlqvist · November 13, 2020, 8:54pm

I would recommend reading this section in the documentation.

