I understand the optimum RAM:disk ratio is 1:24. Based on my own testing, I've used 1GB of RAM to support up to 77GB of data in Elasticsearch (based on Kibana's "store.size"). But that's a bad test, since the cluster crashed at that point.
Based on previous talks with Elastic staff in person, my understanding was that with less RAM, Elastic would simply be slower, and that was a tradeoff I was willing to make. But it seems that if there is not enough RAM, Elastic just crashes instead.
So my questions:
Is there a maximum ratio that Elastic would say is absolutely wrong? e.g. 1:50 is "max".
Is there some setting that would help me improve this ratio, or prevent Elastic from crashing?
The Elasticsearch store.size is the total storage taken by primary and replica shards.
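For reference, you can see that breakdown per index with the _cat API. A minimal example, assuming a node reachable on localhost:9200 (store.size includes replicas, pri.store.size only primaries):

curl -s 'http://localhost:9200/_cat/indices?v&h=index,pri,rep,pri.store.size,store.size'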
Not exactly just slower, but prone to out-of-memory errors due to a small heap, yes.
Why do you need a ratio? There is no one-ratio-fits-all here, I'm afraid. It very much depends on your use case.
If you want to prevent crashing, you either need to optimise what you already have so it works within your current heap size, or increase the heap.
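For example, the heap size is set in config/jvm.options (the 8g below is purely illustrative; keep -Xms and -Xmx equal, at no more than about half the machine's RAM, and below ~32 GB so compressed object pointers stay enabled):

-Xms8g
-Xmx8g

The same values can also be passed via the ES_JAVA_OPTS environment variable when starting the node.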
I am asking for the maximum ratio at which the application will not crash; there's a big difference.
If there were no such thing as a ratio, the Elastic team wouldn't publish 1:24 for Elastic Cloud. So 1:25 would be slightly slower than 1:24, and 1:26 slower still. But whatever the case, according to Elastic, 1:24 does not crash the system. Does 1:25? 1:30? 1:40? 1:50? That is my question.
The disk-to-RAM ratio currently is 1:24, meaning that you get 24 GB of storage space for each 1 GB of RAM.
Tip: For production systems, we recommend not using less than 4 GB of RAM for your cluster, which assigns 2 GB to the JVM heap.
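In other words, at 1:24 a node with 4 GB of RAM corresponds to roughly 4 × 24 = 96 GB of storage, with about half the RAM (2 GB here) assigned to the JVM heap.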
The RAM-to-disk ratio will depend a lot on your use case, query patterns and latency requirements. 1:24 is not in any way a hard limit for Elasticsearch, but rather what we use in Elastic Cloud, as it is suitable for a wide variety of use cases.
The ideal ratio will depend on your hardware and your index and query patterns, so the only way to really know is to benchmark with realistic data and queries on the actual hardware. We talked about cluster sizing at Elastic{ON}, and that talk might give you an idea of how to determine the ratio for your use case and hardware.
ref: https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
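While running such a benchmark, you can watch heap pressure and disk usage per node with the _cat APIs, e.g. (again assuming localhost:9200):

curl -s 'http://localhost:9200/_cat/nodes?v&h=name,heap.percent,ram.percent'
curl -s 'http://localhost:9200/_cat/allocation?v'

If heap.percent stays consistently high (say above ~85%) as data grows, you are approaching the point where the ratio no longer holds for your workload.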
As rightly pointed out by @JKhondhu, there isn't any magic number for the RAM:disk ratio, number of shards, or heap size. You may consider looking at this link.