Why does resizing a data node VM from 1 core to 2 cores add a new node to the cluster?

(Al77056) #1

I created a 3-node cluster on Azure using Elastic's template. All three nodes are master-eligible, and Marvel shows three nodes in the cluster, which is expected.

I then resized one of the three VMs from DS1 to DS2, essentially from a 1-core, 3.5 GB VM to a 2-core, 7 GB VM. After resizing, I was surprised to find that Marvel now reports four nodes in the cluster. Two of the nodes have the same host name and are running on the VM that was resized.

Is this expected behavior? If so, is it because Elasticsearch works better with many smaller nodes, each with limited memory and processing power?

(Mark Walkom) #2

It could be that the old node had to "age" out of the Marvel dashboards.

(Al77056) #3

It is definitely not transient behavior. I checked Marvel again today and it still reports four nodes, even though I only have 3 data node VMs.

I took a look at the elasticsearch script under /etc/init.d and didn't find any logic that would run multiple instances of Elasticsearch. As a matter of fact, there is only one pid in the PID_DIR. However, Marvel clearly shows two nodes running on that VM, and it even shows the distribution of shards across the two nodes on the same VM.

(Mark Walkom) #4

What does _cat/nodes show?
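
For anyone who wants to check this quickly, here is a minimal sketch that groups nodes by host and flags any host reporting more than one node. It assumes the text comes from `GET /_cat/nodes?h=host,name`; the function names and the sample host and node names are made up for illustration and are not from this cluster.

```python
from collections import defaultdict


def nodes_per_host(cat_nodes_text):
    """Group node names by host from `_cat/nodes?h=host,name` output."""
    by_host = defaultdict(list)
    for line in cat_nodes_text.splitlines():
        # The host comes first; the node name follows and may contain spaces.
        parts = line.split(None, 1)
        if len(parts) == 2:
            by_host[parts[0]].append(parts[1].strip())
    return dict(by_host)


def crowded_hosts(cat_nodes_text):
    """Return only the hosts that report more than one node."""
    grouped = nodes_per_host(cat_nodes_text)
    return {host: names for host, names in grouped.items() if len(names) > 1}


# Hypothetical sample; host and node names are invented for illustration.
sample = """\
data-0 Node A
data-1 Node B
data-2 Node C
data-2 Node D
"""
print(crowded_hosts(sample))  # {'data-2': ['Node C', 'Node D']}
```

A host that shows up in the result is running more than one node as far as the cluster is concerned, whatever the dashboards say.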

(Al77056) #5

_cat/nodes lists two nodes on the resized VM, for a total of four nodes on 3 VMs.

I've resized the VM back to DS1 (1 core with 3.5 GB), and there are still two nodes running on that VM even though it only has 1 core now. So it may not be something dynamic that changes when the VM starts up. Something happened when I resized it last time.

Where in Elasticsearch can I configure how many nodes to run on a server?

(Mark Walkom) #6

Does a ps show two processes?

(Al77056) #7

Yeah, it listed two running java processes, each taking about the same amount of memory. One of the two processes has a pid matching the pid in /var/run/elasticsearch/elasticsearch.pid.
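
The comparison described above can be sketched roughly as follows: take the output of `ps -eo pid,args`, pick out the java processes that mention Elasticsearch, and report any whose pid is not the one recorded in the pid file. The function names and the sample command lines are hypothetical, and matching on the substring "elasticsearch" is an assumption about how the process was launched.

```python
def elasticsearch_pids(ps_text):
    """Return PIDs of java processes whose command line mentions elasticsearch.

    Expects the output of `ps -eo pid,args`, including its header row.
    """
    pids = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, args = parts
        # Assumption: the JVM command line contains "elasticsearch" somewhere.
        if "java" in args and "elasticsearch" in args.lower():
            pids.append(int(pid))
    return pids


def untracked_pids(ps_text, pid_file_text):
    """PIDs that look like Elasticsearch but are not the one in the pid file."""
    tracked = int(pid_file_text.strip())
    return [pid for pid in elasticsearch_pids(ps_text) if pid != tracked]


# Hypothetical sample output; pids and command lines are invented.
sample_ps = """\
  PID COMMAND
 1203 /usr/bin/java -Xmx2g org.elasticsearch.bootstrap.Elasticsearch
 4711 /usr/bin/java -Xmx2g org.elasticsearch.bootstrap.Elasticsearch
 4800 /usr/sbin/sshd -D
"""
print(untracked_pids(sample_ps, "4711\n"))  # [1203]
```

On the real VM, the second argument would be the contents of /var/run/elasticsearch/elasticsearch.pid. Any pid that comes back is a process the init script no longer tracks.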

(Mark Walkom) #8

So there's your problem: something happened and another process was spawned.

(Al77056) #9

I've done some more testing, such as changing to DS3 and resizing the VM without shutting it down first. One thing is for sure: running on a 4-core VM does NOT mean there will be 4 data nodes on the VM. It appears that the extra node was only introduced when I had fewer than 3 nodes running during the resize.

Could it be that the Azure setup prefers a minimum of 3 data nodes, and that it will try to run 2 nodes on the same VM if it detects fewer than 3 data nodes in the cluster, even though extra nodes may be brought online later?
