I came across the term RAM:disk ratio while watching one of the webinars, but I don't really understand how the ratio works.
My understanding of ES so far is that if, for example, I have a data node with 64GB of memory and I assign 30GB of it as JVM heap, I'm left with 34GB of memory for Lucene.
So ideally the total index storage on that data node shouldn't exceed 34GB, because if it does there will be a performance hit.
But with a ratio of, for example, 1:24, does that mean that with 34GB of RAM my data node can support up to 816GB of indices, since 34GB x 24 = 816GB?
It is basically a way to specify the storage capacity of a node in relation to the size of the node measured in RAM. This typically assumes 50% of RAM is given to the heap, as per best practice. The ratio is typically low for search-heavy use cases, as these often require data to be cached in the OS page cache. For logging use cases it can, however, often be much higher, as the amount of data stored on a node is often limited by the heap size.
I have not watched that one, but I would expect it to use the same convention used on Elastic Cloud. There, a 16GB node indicates the amount of RAM, and such a node has an 8GB heap. If it had a disk-to-memory ratio of 10, that would mean 160GB of storage.
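As a concrete illustration of that convention (a minimal sketch with made-up helper names, not anything from the webinar or Elastic Cloud itself), the arithmetic is simply node RAM times the ratio, with heap assumed to be half the RAM:

```python
# Sketch of the convention described above (hypothetical helpers, not an official API):
# a node's size is given in RAM, heap is ~50% of that, and storage is RAM times the ratio.

def heap_gb(ram_gb: float) -> float:
    """Heap is conventionally ~50% of node RAM."""
    return ram_gb / 2

def node_storage_gb(ram_gb: float, ram_to_disk_ratio: float) -> float:
    """Storage attached to a node sized by a RAM-to-disk ratio."""
    return ram_gb * ram_to_disk_ratio

print(heap_gb(16))              # 8.0   -> 8GB heap on a 16GB-RAM node
print(node_storage_gb(16, 10))  # 160.0 -> 160GB storage at a 1:10 ratio
```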
I think @zidane28 is confused because all the literature, and you yourself, calculate the RAM:disk ratio with the server RAM and NOT with the heap. But @warkolm said it was the heap amount that went into this ratio.
Which puts us in a pickle. I too would say RAM is the number used everywhere I have ever seen this ratio concept referenced.
You can calculate a ratio with heap if you want, but if people did that they would not be talking about the same thing when they have these discussions.
Maybe Mark just made a mistake in the heat of the moment, or else even I am confused following this thread?
To be clear, the ratio can vary from roughly 1:8 to 1:500 depending on the use case, and it always assumes ~50% of RAM is heap with a maximum of ~64GB of RAM, but we should still all calculate it the same way or else it stops making any sense.
The amount of heap is often what limits how much data you can store on a node, and increasing off-heap memory does not necessarily affect how much data the node can hold. It would therefore make a lot of sense to relate the ratio to heap, although as far as I know that is not the way it is generally done.
On Elastic Cloud you can create nodes of different sizes, but the relation between CPU, RAM (and therefore heap) and storage is constant per node type. This is where I believe the RAM-to-disk ratio came from, as it is used to describe how much storage you get allocated in relation to the size of the node. It can therefore describe how much storage a node has, or just as well how much data it holds.
If you have a node with 16GB RAM (8GB heap) and your node type is highio, which offers a RAM-to-disk ratio of 1:30, this node will have 16 * 30 GB = 480GB of storage attached. This typically works as long as the heap size is 50% of the allocated RAM.
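The same arithmetic can also be run in reverse for rough capacity planning. Here is a hedged sketch (hypothetical helper names and illustrative numbers, not an official sizing tool) that reproduces the highio example above and estimates how many such nodes a given data volume would need:

```python
import math

def storage_per_node_gb(node_ram_gb: float, ram_to_disk_ratio: float) -> float:
    """Storage attached to one node: RAM * ratio (assumes heap = 50% of RAM, RAM <= ~64GB)."""
    return node_ram_gb * ram_to_disk_ratio

def nodes_needed(total_data_gb: float, node_ram_gb: float, ram_to_disk_ratio: float) -> int:
    """Rough number of data nodes needed to hold a given data volume."""
    return math.ceil(total_data_gb / storage_per_node_gb(node_ram_gb, ram_to_disk_ratio))

print(storage_per_node_gb(16, 30))  # 480.0 -> the highio example: 16GB RAM at 1:30
print(nodes_needed(5000, 16, 30))   # 11    -> ~11 such nodes for an illustrative 5TB of data
```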