Why does resizing a data node VM from 1 core to 2 cores add a new node to the cluster?

Al77056 · October 20, 2016, 7:55pm

I created a 3-node cluster on Azure using Elastic's template. All three nodes are master eligible, and Marvel shows there are three nodes in the cluster, which is expected.

I then resized one of the three VMs from DS1 to DS2, essentially from a 1-core, 3.5 GB VM to a 2-core, 7 GB VM. After resizing, I was surprised to find that Marvel now reports four nodes in the cluster. Two of the nodes have the same host name and are running on the VM that was resized.

Is this expected behavior? If it is expected, is this because Elasticsearch works better with a lot of smaller nodes with limited memory and processing power?

warkolm · October 20, 2016, 9:00pm

It could be that the old node had to "age" out of the Marvel dashboards.

Al77056 · October 24, 2016, 12:44pm

It is definitely not transient behavior. I checked Marvel again today and it still reports four nodes, even though I only have 3 data node VMs.

I took a look at the elasticsearch script under /etc/init.d and didn't find any logic that would run multiple instances of Elasticsearch. As a matter of fact, there is only one pid in the PID_DIR. However, Marvel clearly shows two nodes running on that VM, and it even shows the distribution of shards across the two nodes on the same VM.

warkolm · October 24, 2016, 8:47pm

What does _cat/nodes show?
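
For readers following along, here is a minimal Python sketch of one way to check `_cat/nodes` output for hosts that report more than one node. It assumes the output was requested with only the `host` and `name` columns (`_cat/nodes?h=host,name`). The function names and the sample host and node names are hypothetical and do not come from this thread.

```python
from collections import defaultdict


def nodes_per_host(cat_nodes_text):
    """Group node names by host from `_cat/nodes?h=host,name` text output."""
    by_host = defaultdict(list)
    for line in cat_nodes_text.splitlines():
        # Split only once: node names may themselves contain spaces.
        parts = line.strip().split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or incomplete lines
        host, name = parts
        by_host[host].append(name)
    return dict(by_host)


def hosts_with_multiple_nodes(cat_nodes_text):
    """Return only the hosts that report more than one node."""
    grouped = nodes_per_host(cat_nodes_text)
    return {host: names for host, names in grouped.items() if len(names) > 1}


# Hypothetical sample output: four nodes on three hosts.
sample = """\
data-0 node-a
data-1 node-b
data-2 node-c
data-2 node-d
"""

print(hosts_with_multiple_nodes(sample))  # {'data-2': ['node-c', 'node-d']}
```

A host that appears in the result is running more than one Elasticsearch node, which is the situation described in this thread.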

Al77056 · October 25, 2016, 5:30pm

_cat/nodes lists two nodes on the resized VM, for a total of four nodes on 3 VMs.

I've resized the VM back to DS1 (1 core with 3.5 GB), and there are still two nodes running on that VM even though it only has 1 core now. So it may not be something dynamic that changes when the VM starts up. Something happened when I resized it the last time.

Where in Elasticsearch can I configure how many nodes to run on a server?

warkolm · October 26, 2016, 7:58am

Does a ps show two processes?

Al77056 · October 26, 2016, 11:50am

Yeah, it listed two running java processes, each taking about the same amount of memory. One of the two processes has a pid matching the pid in /var/run/elasticsearch/elasticsearch.pid.
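
The check described above, comparing running java processes against the recorded PID, can be sketched in Python roughly as follows. It assumes the process list comes from `ps -eo pid,args` and that the PID file holds a single number. The function names, the PIDs, and the command lines in the sample are hypothetical illustrations, not output from the VM in question.

```python
def elasticsearch_java_pids(ps_output):
    """Extract PIDs of Elasticsearch java processes from `ps -eo pid,args` text."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skips the header row and malformed lines
        pid, args = parts
        if "java" in args and "elasticsearch" in args.lower():
            pids.append(int(pid))
    return pids


def untracked_pids(pid_file_text, running_pids):
    """Return running PIDs that differ from the one recorded in the PID file."""
    tracked = int(pid_file_text.strip())
    return sorted(pid for pid in running_pids if pid != tracked)


# Hypothetical `ps -eo pid,args` output with two Elasticsearch processes.
sample_ps = """\
  PID COMMAND
    1 /sbin/init
 1234 /usr/bin/java -Xms1g org.elasticsearch.bootstrap.Elasticsearch start
 5678 /usr/bin/java -Xms1g org.elasticsearch.bootstrap.Elasticsearch start
"""

running = elasticsearch_java_pids(sample_ps)
print(untracked_pids("1234\n", running))  # [5678]
```

Any PID in the result belongs to an Elasticsearch process that the init script's PID file does not track.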

warkolm · October 26, 2016, 12:22pm

So there's your problem: something happened and another process was spawned.

Al77056 · October 26, 2016, 12:23pm

I've done some more testing, such as changing to DS3 and resizing the VM without shutting it down first. One thing is for sure: running on a 4-core VM does NOT mean there will be 4 data nodes on the VM. It appears that the extra node was only introduced when I had fewer than 3 nodes running during resizing.

Could it be that the Azure setup prefers a minimum of 3 data nodes, and it will try to run 2 nodes on the same VM if it detects fewer than 3 data nodes in the cluster, even though extra nodes may be brought online later?
