ES node RAM allocation on systems with large (> 100 GB) RAM

Hi All

As per the URL https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html

--> 1. The heap should not exceed 32 GB or 50% of RAM, whichever is less, even if more RAM is available, because of how the JVM handles object pointers (compressed pointers stop working above roughly 32 GB). So the safe bet is a 31 GB heap when we have more than 64 GB of RAM.
--> 2. It also says "A machine with 64 GB of RAM is the ideal sweet spot, but 32 GB and 16 GB machines are also common." but nowhere does it mention that we can have more RAM as long as we allocate less than 32 GB to the heap.
--> 3. It also says "Less than 8 GB tends to be counterproductive (you end up needing many, many small machines), and greater than 64 GB has problems"
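
For what it's worth, the rule in point 1 can be written down as a small calculation. This is only a rough sketch of my reading of the guide: the function names `recommended_heap_gb` and `split_ram` are made up for illustration, and the 31 GB cap is the "safe bet" figure from above rather than an exact JVM threshold.

```python
def recommended_heap_gb(total_ram_gb, cap_gb=31):
    """Return a heap size in GB: half of total RAM, capped at cap_gb.

    cap_gb=31 is an assumption taken from the "safe bet" in point 1,
    chosen to stay under the ~32 GB compressed-pointer limit.
    """
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    return min(total_ram_gb // 2, cap_gb)


def split_ram(total_ram_gb, cap_gb=31):
    """Return (heap_gb, remaining_gb); the remainder is what the OS
    can use for the file system cache that Lucene relies on."""
    heap = recommended_heap_gb(total_ram_gb, cap_gb)
    return heap, total_ram_gb - heap


for ram in (16, 64, 128, 256):
    heap, rest = split_ram(ram)
    print(f"{ram} GB RAM: heap {heap} GB, {rest} GB left for the OS and Lucene")
```

With these assumptions the heap stops growing once the machine has 64 GB or more, and every extra GB of RAM goes to the remainder.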

In the URL below, under the "I Have a Machine with 1 TB RAM!" section, if I understand correctly, we can allocate less than 32 GB of RAM to the heap and leave all the remaining RAM (968 GB) for Lucene to use for faster search responses (and other OS activities?). Of course, I'm not planning anything bigger than 128 GB or 256 GB of RAM.
 https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html

The video at https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing says it is better to have a 1:16 ratio of RAM to hard disk.
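
Taking the 1:16 figure at face value, the arithmetic looks like the sketch below. I am assuming the ratio applies to the total RAM of the node (my reading, not something I have confirmed from the video), and `disk_for_ram_gb` is just a hypothetical helper name.

```python
def disk_for_ram_gb(ram_gb, disk_per_ram=16):
    """Disk capacity in GB implied by a 1:disk_per_ram RAM-to-disk ratio.

    Assumption: the ratio is applied to total node RAM, not heap only.
    """
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    return ram_gb * disk_per_ram


for ram in (64, 128, 256):
    disk = disk_for_ram_gb(ram)
    print(f"{ram} GB RAM at 1:16 implies about {disk} GB of disk per node")
```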

So, I wanted to get clarification on the following:

  • I'm planning to go with 128 GB or more of RAM, allocating 31 GB to the heap and leaving 97 GB+ for Lucene and other activity. Do you see any concerns?
  • If user search volume grows later and responses get slow, can I still increase the RAM to 256 GB or 512 GB and keep the heap at 31 GB?

I totally understand that we should not go beyond 32 GB for the heap. I want to understand what the problem is, if any, with leaving more RAM (100+ GB) to Lucene and other OS activities.

Best Regards
Jayanna Hallur

None. This is how Elasticsearch is deployed to back the search for Wikipedia, for example. I expect others do the same.

It'll help if you find yourself IO bound. I'd make sure I'm running super nice SSDs before going to a huge amount of RAM, but after that RAM is how you fix IO problems, yes.

If you are CPU bound on queries you are better off adding more nodes and using more replicas.

Thank you Nik for the confirmation. I'm going with EBS volumes with up to 10,000 Mbps network speed, and that should be more than sufficient for the IO operations.
