Maximum RAM recommended for data node

Hi all,

I'm new here, so please forgive any deviations from the expected format.
ES version: 7.5

I'm going to deploy a hot-warm-cold architecture with dedicated data, master and ingest nodes. However, even our 'hot' tier of data is to be kept for 180 days, and would be around 50TB in total, including replicas.
Keeping a disk-to-RAM ratio of 24:1, I will still need about 40 nodes to handle this if I'm to keep each node below 64GB of memory (I have heard somewhere not to go above 64GB per node).

Is 64GB actually a recommended limit, or can I go up to 256GB or so per node without issues?
And is that so-called disk:RAM ratio calculated using the total disk size including replicas, or using a single copy of the data only?
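As a sanity check, the node-count arithmetic can be sketched as below. The constants are the figures quoted above; whether the 50TB should include replicas is exactly the open question:

```python
import math

# Back-of-envelope node count using the figures quoted above:
# a 24:1 disk-to-RAM ratio and 64 GB of RAM per node. Whether the
# 50 TB should include replicas is the open question in this post.
TOTAL_DATA_TB = 50            # total on-disk size, replicas included
DISK_TO_RAM_RATIO = 24        # GB of disk served per GB of RAM
RAM_PER_NODE_GB = 64

disk_per_node_tb = DISK_TO_RAM_RATIO * RAM_PER_NODE_GB / 1024  # 1.5 TB
nodes_needed = math.ceil(TOTAL_DATA_TB / disk_per_node_tb)

print(f"{disk_per_node_tb} TB of disk per node -> {nodes_needed} nodes")
```

That comes out to roughly 34 nodes before any head-room, consistent with the "about 40" figure above.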

Thanks.

If you have a retention period of 180 days for your hot tier, it is in my opinion no longer a hot tier but rather a warm tier that also does indexing. With 180 days of data per node, the amount indexed (and later relocated to the warm zone) each day is less than one percent of the data held on the node. This sounds more like a warm-lukewarm-cold architecture.

The purpose of hot nodes is to use as few nodes as possible for indexing, as this requires a lot of CPU and memory as well as disk I/O. Querying data held on warm nodes can therefore often be faster than querying hot nodes, as warm nodes solely serve queries and are considerably less busy.

If the driver for this hot retention period is that you need to support a high volume of queries across up to 180 days of data, it is possible that what you really need is a warm zone backed by SSDs, or possibly even a four-zone system. ILM is, as far as I know, limited to 3 zones, but if you use Curator you can have as many zones as you like.
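For what it's worth, a three-phase ILM policy of the kind described can be sketched as below. This is a minimal illustration only: it assumes custom node attributes (`data: warm` / `data: cold`) for tier allocation, since ES 7.5 predates built-in data tiers, and all the sizes and timings are made-up placeholders, not recommendations.

```python
import json

# Illustrative ILM policy body for a hot-warm-cold layout on ES 7.5.
# Tiers are assumed to be implemented with a custom node attribute
# named "data"; every min_age / max_size value here is a placeholder.
policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {"rollover": {"max_size": "50gb", "max_age": "1d"}}
            },
            "warm": {
                "min_age": "3d",
                "actions": {"allocate": {"require": {"data": "warm"}}},
            },
            "cold": {
                "min_age": "30d",
                "actions": {"allocate": {"require": {"data": "cold"}}},
            },
            "delete": {
                "min_age": "180d",
                "actions": {"delete": {}},
            },
        }
    }
}

# This body would go to PUT _ilm/policy/<policy-name>; dump for inspection.
print(json.dumps(policy, indent=2))
```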

Hi Christian_Dahlqvist,

Thanks for the reply. We are now going to discuss with our client whether they really need a 180-day hot tier, as it looks impractical.
BTW, is there still a practical limit of 64GB of RAM per node?

Thanks and regards.

As far as I know, the recommendation is still to keep the heap at or below 30GB even if you are using G1GC, but it would be better to get someone from Elastic to comment on that.

This blog post describes the node types and talks about the fact that moving data to warm nodes does not necessarily make it slower to query. I recall talking to a user with a very large cluster at Elastic{ON}, and his view was that getting data off hot nodes as quickly as possible gave users the best query performance. He therefore only held 3 or 4 days' worth of data on the hot nodes to make them truly hot. He had a much lower disk-to-RAM ratio than you mentioned and made sure the disks were never filling up. As far as I recall he used fast local spinning disks for his warm tier, but I have seen SSDs become more and more common here as well.

Hi Christian_Dahlqvist, Thanks for the help provided.

That's correct. The heap size recommendations (and the reasons for them) are documented in the manual.

Note that it's possible to run more than one node per host if you have a lot of RAM available, but also note that the filesystem cache is an important contributor to Elasticsearch's performance. It's important to leave enough free memory for the OS to work its caching magic.
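A rough sketch of the memory split this implies, assuming the commonly cited guidance of "at most half of RAM for heap, capped near 30GB" (the exact cap depends on the compressed-oops threshold of your JVM, so treat the numbers as approximations):

```python
def heap_for_node(host_ram_gb: float) -> float:
    """Commonly cited guidance: give the JVM heap at most half of RAM,
    capped around 30 GB to stay under the compressed-oops threshold.
    Everything left over serves the OS filesystem cache (or further
    Elasticsearch nodes on the same host)."""
    return min(host_ram_gb / 2, 30.0)

for ram in (64, 128, 256):
    heap = heap_for_node(ram)
    print(f"{ram} GB host: ~{heap:.0f} GB heap, "
          f"~{ram - heap:.0f} GB left for FS cache / extra nodes")
```

This is why a 256GB host doesn't simply mean a 128GB heap: beyond the cap, the extra RAM is better spent on filesystem cache or on additional nodes.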
