How does heap shortage in the masters affect datanodes?

thekad · February 26, 2016, 9:26pm

Hi,

Versions:

Ubuntu 14.04
ES 1.4.2 from elastic repos
Java 1.7.0_95

Cluster:

6 c3.8xlarge data nodes ( http://www.ec2instances.info/?filter=c3.8xlarge ) with 30G of heap configured
3 dedicated masters (originally http://www.ec2instances.info/?filter=r3.large ) with 7G of heap configured; all client requests are routed through these 3 servers

Index:

Single index with 60 shards, 1 primary and 2 replicas, shards ranging from 10G to 40G
260Mil documents
not using doc values (looking into it)

The problem presented itself after attempting to replace the r3.large masters with m3.large ( http://www.ec2instances.info/?filter=m3.large ) with 4G of heap configured. The 3 masters exhibited the following behavior:

In essence: over 75% heap usage and increased amount of GC and CPU (due to GC probably)

At the same time, every datanode exhibited the following behavior:

i.e. the same behavior. While the smaller masters were in charge of the cluster, we saw them directly affecting the rest of the cluster. We replaced the masters with m3.xlarge instances ( http://www.ec2instances.info/?filter=m3.xlarge ) because we weren't sure at the time whether the problem was CPU or heap. Now it looks to me as if heap pressure was the problem: you will notice in the last image that heap usage on the datanodes is back to around 75% and old GC is back to almost nothing.
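
For readers who want to check the same heap figures without graphs, here is a minimal sketch that filters a node stats response for nodes above a heap threshold. It assumes the input is the parsed JSON of `GET /_nodes/stats/jvm`; the function name, the 75% default and the sample data are illustrative, not taken from this cluster.

```python
def nodes_over_heap_threshold(stats, threshold=75):
    """Return (node name, heap_used_percent) pairs above the threshold, highest first."""
    hot = []
    for node_id, node in stats.get("nodes", {}).items():
        pct = node.get("jvm", {}).get("mem", {}).get("heap_used_percent")
        if pct is not None and pct > threshold:
            # Fall back to the node id if the response carries no name
            hot.append((node.get("name", node_id), pct))
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Made-up sample in the shape of a node stats response (not real measurements)
sample_stats = {
    "nodes": {
        "a1": {"name": "master-1", "jvm": {"mem": {"heap_used_percent": 82}}},
        "b2": {"name": "data-1", "jvm": {"mem": {"heap_used_percent": 61}}},
    }
}

print(nodes_over_heap_threshold(sample_stats))
```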

A few questions then:

Why did the heap pressure in the masters affect the entire cluster? (in such a way that every datanode also showed heap pressure)
In this particular use case, what would be the ideal master setup? My hunch says memory is the factor here (as far as I can tell the active master's process is single-threaded so it doesn't really matter how many cores the machine has?) and having 7G of heap size configured seems to have "fixed" this issue
Is it particularly harmful for the active master to be serving requests? i.e. should I remove the active master from the configured URLs in the services using ES?
Other general suggestions with regards to index/shard size appreciated

Thank you all.

warkolm · February 26, 2016, 10:22pm

How big is your mapping for this monolithic index?

thekad · February 26, 2016, 10:54pm

Hi @warkolm thanks for responding,

We have 9 pretty big mappings for that particular index. I would paste but:

curl -XGET http://localhost:9200/my_index/_mapping?pretty > /tmp/mappings.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 85.5M 100 85.5M 0 0 15.1M 0 0:00:05 0:00:05 --:--:-- 20.3M

So not easy to share. I can tell you though that the mappings have on average 75 properties each.

warkolm · February 26, 2016, 11:18pm

That's quite large, especially if there are 9 of those! ES needs to store that in cluster state, which is held in memory. It does compress it, but that's still pretty big.
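
To get a rough feel for how much each mapping contributes, the sketch below counts fields and measures the compact JSON size per type. It assumes the input is the parsed response of `GET /<index>/_mapping` as returned by ES 1.x (index name, then `mappings`, then type name); the function names and the sample mapping are hypothetical. The JSON size is only a rough indicator, not the in-memory size of the cluster state.

```python
import json


def count_fields(properties):
    """Recursively count fields, including object sub-properties and multi-fields."""
    total = 0
    for field in properties.values():
        total += 1
        total += count_fields(field.get("properties", {}))
        total += count_fields(field.get("fields", {}))
    return total


def summarize_mapping(mapping_response):
    """Return {(index, type): {"fields": n, "json_bytes": m}} for every mapping type."""
    summary = {}
    for index_name, index_body in mapping_response.items():
        for type_name, type_body in index_body.get("mappings", {}).items():
            compact = json.dumps(type_body, separators=(",", ":"))
            summary[(index_name, type_name)] = {
                "fields": count_fields(type_body.get("properties", {})),
                "json_bytes": len(compact.encode("utf-8")),
            }
    return summary


# Hypothetical mapping, far smaller than the 85.5M one discussed here
sample_mapping = {
    "my_index": {
        "mappings": {
            "doc": {
                "properties": {
                    "title": {
                        "type": "string",
                        "fields": {"raw": {"type": "string", "index": "not_analyzed"}},
                    },
                    "author": {"properties": {"name": {"type": "string"}}},
                }
            }
        }
    }
}

print(summarize_mapping(sample_mapping))
```

Running this over the file saved by the curl command (loaded with `json.load`) would show which of the 9 mappings dominate.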

You're on a pretty old version, can you upgrade?

thekad · February 27, 2016, 12:41am

Looking into the upgrade path to possibly 1.7.5, which will help I am sure.
With regards to my original question, are you suggesting the heap needed by the master is directly tied to the number and size of the mappings?

warkolm · February 27, 2016, 12:42am

It's likely. What was heap use like before the change? Or do you not have metrics from that time?

thekad · February 27, 2016, 12:51am

I do, actually. Here's what the old r3.large masters looked like before we attempted to replace them with the smaller instances:

The graphing is kind of messed up because the sampling size is 1mo. You can tell the heap usage on the active master (green) is steady at 75% and the other 2 much lower.

Christian_Dahlqvist · February 27, 2016, 4:05am

If you are sending all requests through these nodes, they are not truly dedicated master nodes, but rather master/client nodes. It is recommended that dedicated master nodes do not serve traffic, but are left to just manage the cluster. This will prevent them from suffering from memory pressure and intense garbage collection, and make the cluster more stable.

If you instead set up dedicated client nodes, you may be able to reduce the size of the dedicated master nodes as they will be doing a lot less work.

thekad · February 27, 2016, 5:23am

Interesting. My initial thought was that resource usage in the masters was so low they could be used for routing, especially the passive masters since they are doing close to nothing. If your suggestion is that I can have routing (client) nodes, then what would be the minimum requirements to run these?

Christian_Dahlqvist · February 28, 2016, 3:05am

Truly dedicated master nodes can often be considerably smaller than the ones you used, and can have heap as a larger portion of the available memory (~75%) than data nodes, as they do not need as much file system cache as nodes that hold data. Instances suitable here may be t2.medium or m3.medium unless the cluster state is very large.

The same recommendation around heap size also applies to client nodes, as they also do not store data. In addition to acting as routers, these nodes also coordinate requests and perform the final aggregation steps once data from the underlying shards has been gathered, which is why they can suffer from garbage collection depending on the workload. These are more difficult to size, but a good starting point may be the instance types you previously used together with the higher heap setting.
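
As a rough illustration of the split described above, the sketch below lists the two `elasticsearch.yml` settings that separate the roles in ES 1.x (`node.master` and `node.data`) and applies the ~75% heap guideline for nodes that hold no data. The helper names and the 4 GB example are hypothetical; treat the numbers as a starting point, not a sizing formula.

```python
def node_role_settings(role):
    """Return the ES 1.x node.master / node.data settings for a given role."""
    roles = {
        # Master-eligible, holds no data, should not receive client traffic
        "dedicated_master": {"node.master": True, "node.data": False},
        # Neither master-eligible nor data: routes and coordinates requests
        "client": {"node.master": False, "node.data": False},
        # Holds shards, not master-eligible
        "data": {"node.master": False, "node.data": True},
    }
    return roles[role]


def heap_for_non_data_node(ram_gb, fraction=0.75):
    """Heap suggestion in GB for master or client nodes, using the ~75% guideline."""
    return round(ram_gb * fraction, 1)


print(node_role_settings("client"))
print(heap_for_non_data_node(4.0))  # 4.0 GB of RAM is an assumed example
```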

thekad · February 29, 2016, 4:39am

Ah, here we go, we're converging back on the size of the cluster state; I think that's my key problem. I believe I have 2 final questions:

Is there a formula to size the JVM heap according to the cluster state? i.e. can I grab the mappings and somehow get the number of GB I absolutely need for my heap size?
Is the active master process single-threaded? So far I think so but I haven't dug through the code to make sure.

Many thanks.

warkolm · February 29, 2016, 4:45am

Not really. Cluster state is kept in binary form in memory, so the JSON output is simply a human-readable version.
What do you mean by master process exactly?

thekad · February 29, 2016, 4:48am

In my case I have 3 masters, but only one is active at any given time. This active master shows CPU usage on a single core all the time, which to me means that the ES master process is always bound to a single CPU, but I am not 100% positive.

warkolm · February 29, 2016, 5:06am

Check hot_threads against that node to see what is happening.

thekad · February 29, 2016, 9:34pm

Sorry, I didn't mean to say 1 core is always pegged. I meant to say that when there is activity in the active master it's always a) brief and b) only on a single core. Kinda hard to grab a hot_threads dump when it spikes so briefly.
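
One way around the short spikes might be to poll the hot threads API in a loop and keep every dump for later inspection. This is only a sketch: it assumes the cluster is reachable over plain HTTP, and the node name `master-1`, the sample count and the pause are hypothetical values to adjust.

```python
import time
import urllib.request
from urllib.parse import quote, urlencode


def hot_threads_url(base_url, node, threads=3, interval="500ms"):
    """Build the hot threads URL for one node (node name or id)."""
    query = urlencode({"threads": threads, "interval": interval})
    return "{}/_nodes/{}/hot_threads?{}".format(
        base_url.rstrip("/"), quote(node, safe=""), query
    )


def capture_hot_threads(base_url, node, samples=10, pause_seconds=1.0):
    """Fetch several hot threads dumps in a row and return them as text."""
    dumps = []
    for _ in range(samples):
        url = hot_threads_url(base_url, node)
        with urllib.request.urlopen(url, timeout=10) as resp:
            dumps.append(resp.read().decode("utf-8"))
        time.sleep(pause_seconds)
    return dumps


if __name__ == "__main__":
    # Only builds and prints the URL; call capture_hot_threads against a live cluster
    print(hot_threads_url("http://localhost:9200", "master-1"))
```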

Christian_Dahlqvist · February 29, 2016, 11:40pm

Most of the work a dedicated master node does is probably linked to maintaining the cluster state, which could very well be single threaded in nature. It is generally not very CPU intensive, which is why I recommended using instances with relatively little CPU.

warkolm · March 1, 2016, 1:16am

Cluster state updates are single threaded, to ensure consistency.
