In one of their tips they say: "A good rule-of-thumb is to ensure you keep the number of shards per node below 20 per GB heap it has configured. A node with a 30GB heap should therefore have a maximum of 600 shards, but the further below this limit you can keep it the better."
So even if I assume an average shard size of 20GB, can a node with a 30GB heap store 600 shards x 20GB = 12 TB of data without issues?
Please help me understand this.
A node may hold more or less than that depending on the data, mappings and workload. These are all rough guidelines around best practices, and I created this blog post because I saw a lot of users ending up in serious trouble due to having large numbers of small shards, which for many use cases is very inefficient.
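To make the arithmetic behind the question explicit, here is a minimal sketch of the rule-of-thumb calculation. The heap size and average shard size are just the figures from the question, not recommendations, and the result is a ceiling on shard count, not a promise that a node can comfortably serve that much data:

```python
# Illustrative only -- values taken from the question above, not sizing advice.
heap_gb = 30                     # heap configured on the node
max_shards = heap_gb * 20        # guideline ceiling: at most 20 shards per GB of heap -> 600

avg_shard_size_gb = 20           # assumed average shard size from the question
implied_data_tb = max_shards * avg_shard_size_gb / 1000   # 600 * 20 GB = 12 TB

print(max_shards, implied_data_tb)
```

Note that the 12 TB figure only falls out of multiplying the shard ceiling by an assumed shard size; whether a node can actually hold and serve that much data depends on the data, mappings and workload, as noted above.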
I was looking for dimensioning guidance for setting up an Elasticsearch cluster (number of nodes, memory, disk, etc.) for, say, x indices of y GB each with a given retention period.
So any guidance on that would be very helpful.
This will depend a lot on the use case and how much you index per day compared to the retention period on the nodes. The webinar is quite old and that also plays a part to some extent.
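There is no single formula, but a very rough starting point is to estimate raw storage from daily ingest and retention, then work backwards to a node count. The sketch below is only an illustration under stated assumptions; the ingest rate, expansion factor, replica count, disk size and headroom are hypothetical placeholders you would need to measure for your own data:

```python
import math

# Very rough capacity estimate -- every input here is an assumption to validate
# against your own data, mappings and workload.
daily_ingest_gb = 100      # raw data indexed per day (assumed)
retention_days = 30        # how long indices are kept (assumed)
expansion_factor = 1.1     # on-disk size vs raw size; varies with mappings (assumed)
replicas = 1               # replica copies per primary (assumed)
disk_per_node_gb = 2000    # usable disk per data node (assumed)
headroom = 0.7             # keep ~30% free for merges, watermarks, growth (assumed)

total_storage_gb = daily_ingest_gb * retention_days * expansion_factor * (1 + replicas)
nodes_for_storage = math.ceil(total_storage_gb / (disk_per_node_gb * headroom))

print(f"~{total_storage_gb:.0f} GB total, ~{nodes_for_storage} data node(s) for storage alone")
```

This only covers disk capacity; indexing throughput, query load, heap pressure and the shard-count guideline discussed above can all push the required node count higher.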