Expected Heap Memory Requirements for node.data==false, node.master==false

klahnakoski · July 13, 2015, 7:05pm

Good Day!

I have a cluster with 3 billion records, 2 TB of data, 24 shards, and 1 replica. It is split into two zones: Zone 1 has a single machine, acting as the master node, with all 24 shards. Zone 2 has multiple nodes (usually 10) that share the replicas. This configuration was chosen to minimize AWS costs, with Zone 2 being spot instances. The cluster is optimized for low cost (it is part of an experiment), and I am working on reducing the number of OoM errors. Search speed is not a priority at this time because our customers are internal and know what to expect in terms of latency.

The master node, being the only permanent node, gets all search requests and has the highest heap requirements (average peak of 25 GB). The other nodes only require 5 GB or 10 GB, depending on the number of shards they hold. I would like to reduce the heap requirements on the master node. I have already converted all properties to doc_values==true, which gave excellent heap reductions, but I want to go lower.

I was considering adding a permanent node to Zone 2 (node.data==false, node.master==false) which will be used to accept all search requests. I want to do this because I believe the current configuration is not fully utilizing the shards in Zone 2, instead preferring the shards on the master in Zone 1. I am apprehensive about setting up this node because the On-Demand prices are much higher (approximately 5x) than the Spot prices.

If I set up a non-data node, how should I set the heap? Do I continue to set heap to 50% of total memory, or can I crank it up to 90% (because there is no data, there are no Lucene indexes, and there is minimal need for drive cache)?

Comments? Suggestions?

Thank you!

warkolm · July 13, 2015, 9:42pm

You can increase heap to 75% (or more if you want to live on the edge) for client nodes; you don't have to worry about FS caching for these, after all.
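As a rough illustration of the percentages discussed in this thread (50% for data nodes, 75% for client nodes), here is a minimal Python sketch. The function and dictionary names are hypothetical, and the output format assumes the heap is passed as a size string such as `12288m` (for example via the `ES_HEAP_SIZE` environment variable used by Elasticsearch releases of this era); adjust to however your install sets the JVM heap.

```python
# Heap fractions taken from this thread; they are rules of thumb, not limits.
HEAP_FRACTION = {
    "data": 0.50,    # leave the other half of RAM to the filesystem cache
    "client": 0.75,  # node.data=false, node.master=false: little need for FS cache
}


def heap_mb(total_ram_mb: int, role: str) -> int:
    """Return a suggested JVM heap size in MB for the given node role."""
    return int(total_ram_mb * HEAP_FRACTION[role])


def heap_size_string(total_ram_mb: int, role: str) -> str:
    """Format the suggestion as a size string such as '12288m'."""
    return f"{heap_mb(total_ram_mb, role)}m"


if __name__ == "__main__":
    # Example: a 16 GB machine in each role.
    for role in ("data", "client"):
        print(role, heap_size_string(16 * 1024, role))
```

On a 16 GB machine this suggests roughly 8 GB of heap for a data node and 12 GB for a client node.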

unknownunknown · November 25, 2015, 6:52pm

Does the 30.5GB rule hold true here, as well?

warkolm · November 26, 2015, 12:41am

Yes.
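A small sketch of how that cap combines with a percentage-of-RAM setting, assuming the 30.5 GB figure is applied as a hard upper bound (the usual rationale is keeping the JVM below the point where it stops using compressed object pointers). The names below are hypothetical and the 75% default is the client-node figure mentioned earlier in the thread.

```python
# 30.5 GB expressed in MB; treated here as a hard ceiling on the heap.
HEAP_CAP_MB = int(30.5 * 1024)


def capped_heap_mb(total_ram_mb: int, fraction: float = 0.75) -> int:
    """Return fraction * RAM in MB, but never more than the 30.5 GB cap."""
    return min(int(total_ram_mb * fraction), HEAP_CAP_MB)


if __name__ == "__main__":
    # A 16 GB client node stays under the cap; a 64 GB one is clamped to it.
    for ram_gb in (16, 64):
        print(ram_gb, "GB RAM ->", capped_heap_mb(ram_gb * 1024), "MB heap")
```

In other words, on a large client node the percentage stops mattering once it would push the heap past the cap.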

Srinath_C · November 26, 2015, 1:19am

The master node, being the only permanent node, gets all search requests...

Isn't this risky? With an increase in search load, the master node might run into memory issues and fail to react to cluster membership events.

warkolm · November 26, 2015, 8:34pm

It sure is.
