Hi community, I'm confused about how to apply the 1:30 RAM:disk ratio for hot nodes alongside the JVM heap limit.
My understanding:

* A 1:30 ratio means 1 GB of RAM per 30 GB of disk capacity on hot nodes.
* For 2 TB (2000 GB) of disk → ~66 GB RAM total (~31 GB heap + the rest for OS cache).
* For 4 TB of disk → ~133 GB RAM total.

But here's my confusion: the official docs recommend a max of ~30-31 GB JVM heap (Xms/Xmx) per node for compressed-oops performance.
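Here's the arithmetic I'm doing, as a quick Python sketch (the 1:30 ratio and the 50%-of-RAM heap split are just the heuristics from my understanding above, not anything official):

```python
# Sketch of my sizing assumptions (heuristics, not official guidance):
# - 1 GB RAM per 30 GB of disk on hot nodes (the "1:30" ratio)
# - heap = min(50% of RAM, ~31 GB); the rest is left for the OS page cache

HEAP_CAP_GB = 31  # the compressed-oops ceiling the docs mention

def ram_for_disk(disk_gb, ram_to_disk=1 / 30):
    """Total RAM implied by the ratio, split into heap and OS cache."""
    ram_gb = disk_gb * ram_to_disk
    heap_gb = min(ram_gb / 2, HEAP_CAP_GB)
    return ram_gb, heap_gb, ram_gb - heap_gb

for disk_tb in (2, 4):
    ram, heap, cache = ram_for_disk(disk_tb * 1000)
    print(f"{disk_tb} TB disk -> ~{ram:.1f} GB RAM (~{heap:.0f} GB heap, ~{cache:.1f} GB OS cache)")
```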
Questions:

1. Does this mean hot nodes are practically limited to ~2 TB of disk max (with 64 GB total RAM, 31 GB heap)?
2. For 4+ TB of disk per node, do I need 128+ GB total RAM but keep heap at 31 GB (extra RAM = OS cache)?
3. Or should I avoid >2 TB of disk per hot node regardless of the total RAM available?
TL;DR: With the 31 GB heap limit, can hot nodes really handle 4+ TB of disk safely? What's the practical max disk per hot node these days?

Context: Planning hot nodes with NVMe SSDs, ES 8.x.
I think you're overstating the 1:30 ratio, which is a ballpark heuristic at best. Assigning 50% of RAM as heap (up to ~31 GB) is also just a suggested guideline, though it's more often heeded since it's the out-of-the-box setting. But you will find threads on here where people are advised to use more or less RAM for heap; it depends.
What do you mean by "safely"? Anyway, the answer to these questions is almost always "it depends", and what it depends on is the workload.
btw, since 9.x has been out a while and has reached 9.2.x already, why start now with 8.x?
A hot-warm architecture is generally based on the following assumptions:
You have a subset of nodes (hot nodes) with very fast storage and other nodes with slower storage.
You are indexing immutable time-series data into some form of time-based indices that then age through the cluster.
Indexing in Elasticsearch can be very I/O intensive so all indexing is to be performed by the hot nodes as they have superior I/O performance.
The most recently indexed data is the most relevant and therefore most frequently queried. Older data is queried less frequently. As the hot nodes hold the most recently created indices, this means that they also often handle a large portion of the query load.
When sizing hot nodes and deciding how much data they are to hold, e.g. the disk-to-RAM ratio, the ideal will depend on the expected ingest rate per node. If you push them to the limit with indexing, holding a lot of frequently queried data can cause problems with CPU usage, disk I/O, RAM or heap usage. If the indexing rate per hot node is expected to be lower, you may get away with holding more data on these nodes without suffering performance problems.
The ratios used in the cloud are just generic recommendations that work well for a lot of use cases, not something you necessarily need to follow. I would recommend benchmarking with a realistic mix of indexing and query load (not just benchmarking indexing and hoping that querying at that indexing rate will be fine) to see how much data your hot node specification can support for your load profile before latencies become unacceptable.
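Just to illustrate why the ratio alone does not answer the sizing question, here is a rough back-of-the-envelope sketch; every input below is a made-up placeholder, and only a benchmark with your real data and queries will tell you what a node can actually sustain:

```python
# Back-of-the-envelope hot-tier sizing. All inputs are hypothetical placeholders;
# this only gives a storage floor, not a statement about indexing/query capacity.

daily_ingest_gb = 500        # raw data indexed per day (assumption)
replicas = 1                 # one replica copy of each primary shard
hot_retention_days = 7       # days the data stays on the hot tier (assumption)
disk_per_hot_node_gb = 4000  # e.g. 4 TB NVMe per hot node
usable_fraction = 0.75       # stay well below the disk watermarks

hot_tier_data_gb = daily_ingest_gb * (1 + replicas) * hot_retention_days
usable_per_node_gb = disk_per_hot_node_gb * usable_fraction
nodes_for_storage = -(-hot_tier_data_gb // usable_per_node_gb)  # ceiling division

print(f"Hot tier holds ~{hot_tier_data_gb:.0f} GB -> at least {nodes_for_storage:.0f} node(s) for storage alone")
```

Storage math like this only gives you a minimum node count; whether each of those nodes can also keep up with the indexing and query load at that data volume is exactly what the benchmark has to show.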
I'm not sure how you went from what I wrote to this question, but it's a bit alarming. Are you, or anyone on your team, experienced with Elasticsearch? Has anyone done any Elastic courses or certifications? Do you perhaps have access to an experienced architect to help here?
Do not get stuck on the 1:30 RAM-to-disk ratio. A node with a 30 GB heap can often handle 4 TB of data on disk if the load allows. If the heap is insufficient for the load and you have very large hosts, it is common to virtualize and run multiple smaller nodes per host instead of one very large node.
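To make the multiple-nodes-per-host point concrete, here is a rough illustration (the host size and per-node RAM split are just example numbers, adjust for your hardware):

```python
# Illustration: splitting one large host into several ES nodes so that each node
# keeps its heap at or below ~31 GB. All numbers are hypothetical examples.

host_ram_gb = 256
heap_cap_gb = 31
ram_per_node_gb = 64  # ~31 GB heap + ~33 GB left for the OS page cache per node

nodes_per_host = host_ram_gb // ram_per_node_gb
aggregate_heap_gb = nodes_per_host * heap_cap_gb

print(f"{host_ram_gb} GB host -> {nodes_per_host} nodes, "
      f"~{aggregate_heap_gb} GB heap in total vs ~{heap_cap_gb} GB on a single node")
```

That way the host's RAM still gets used, but no single JVM crosses the compressed-oops threshold.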
If you lack or have limited experience with Elasticsearch, I do agree with the recommendation to seek assistance from someone who does.