Elasticsearch 3 node cluster failing if master is down second time

samdash · May 27, 2017, 8:09pm

I have a 3-node cluster and want to know the best possible Elasticsearch configuration. Below is the configuration I have tried.

Server 1,2,3

node.name : node1 [ node2, node3 ]
node.master: true
node.data: true

It works fine, but one scenario fails during testing. Below are the tests I have performed.

Test 1

Bring all 3 nodes up and shut down the master node (in this case node1 is master). Cluster state goes to yellow and back to green after a while with 2 nodes running.
Now bring back the node that was shut down, so all 3 nodes are up.
Now node2 is elected as master and everything is working fine.

Test 2

After Test 1 has run, all 3 nodes are up and node2 is the elected master.
I shut down node2, and now I am not able to connect;
I get the failed-to-connect error below.

curl: (7) Failed to connect to localhost port 9200: Connection refused

However, node1, which is now elected as master, shows the messages below, with cluster health status changing from [YELLOW] to [GREEN].

Message


[2017-05-26T22:52:23,996][INFO ][o.e.c.r.a.AllocationService] [node-1] Cluster health status changed from [GREEN] to [YELLOW] (reason: [{node-2}{6THtjHP4SuKsrAZYcOv6Sw}{wyC6CjSESOOisEMTOE_wCw}{127.0.0.1}{127.0.0.1:9301} transport disconnected, {node-2}{6THtjHP4SuKsrAZYcOv6Sw}{wyC6CjSESOOisEMTOE_wCw}{127.0.0.1}{127.0.0.1:9301} transport disconnected]).
[2017-05-26T22:52:23,997][INFO ][o.e.c.s.ClusterService   ] [node-1] removed  {{node-2}{6THtjHP4SuKsrAZYcOv6Sw}{wyC6CjSESOOisEMTOE_wCw}{127.0.0.1}{127.0.0.1:9301},}, reason: zen-disco-node-failed({node-2}{6THtjHP4SuKsrAZYcOv6Sw} {wyC6CjSESOOisEMTOE_wCw}{127.0.0.1}{127.0.0.1:9301}), reason(transport  disconnected)[{node-2}{6THtjHP4SuKsrAZYcOv6Sw}{wyC6CjSESOOisEMTOE_wCw}{127.0.0.1} {127.0.0.1:9301} transport disconnected, {node-2}{6THtjHP4SuKsrAZYcOv6Sw}{wyC6CjSESOOisEMTOE_wCw}{127.0.0.1}{127.0.0.1:9301} transport disconnected]
[2017-05-26T22:52:24,099][INFO ][o.e.c.r.DelayedAllocationService] [node-1] scheduling reroute for delayed shards in [59.7s] (3 delayed shards)
[2017-05-26T22:53:26,115][INFO ][o.e.c.r.a.AllocationService] [node-1] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[shakespeare][3], [shakespeare][4]] ...]).

Test 3

After Test 1 has run, all 3 nodes are up and **node2** is the elected master.
I shut down **node3**, and everything keeps working fine,
since I did not shut down the master the second time.

warkolm · May 27, 2017, 10:35pm

This is expected behaviour given your configuration.

However, is there something you'd like to ask here? It's not clear.

samdash · May 27, 2017, 10:42pm

Thanks for the response, warkolm. So there is no solution? I am randomly bringing down the master node. The first time I bring it down, it works fine; however, when I bring it down the second time, it fails.
All 3 of my nodes are master-eligible:
node.master: true
node.data: true

Not sure why the cluster does not respond the second time. Let me know what the best configuration for a 3-node cluster is.

warkolm · May 27, 2017, 11:03pm

Are you talking about when you curl the node you stopped? That's entirely expected.

samdash · May 28, 2017, 1:02am

When I run the curl command for cluster health, it does not work.
curl -XGET 'localhost:9200/_cluster/health?pretty'
curl: (7) Failed to connect to localhost port 9200: Connection refused

My elastic head does not show any nodes, but only when I shut down the master the second time.

warkolm · May 28, 2017, 1:48am

If you are on that node and then stop ES then of course it won't respond. You need to contact a different node.

samdash · May 28, 2017, 7:35pm

Thanks. Currently I am testing the 3-node cluster on my Mac (a single machine). Maybe that is the reason.
Anyhow, sometimes I am able to connect to the cluster and sometimes not. Will debug more.

warkolm · May 28, 2017, 8:46pm

Right, well that makes more sense!

See how the transport port in your log is 9301, not 9300? When you start another node on the same host, ES picks 9300+1; if you start another it's 9300+2, and so on. It's the same for the HTTP port on 9200: the first node gets 9200, the second 9201, the third 9202.

So if you stop the first node you need to curl localhost:9201.
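
A minimal sketch of that advice in Python, assuming the three nodes run on one host and were assigned the default HTTP ports 9200 to 9202 as described above. The function name first_healthy_node, the helper _http_get and the fetch parameter are made up for this illustration; only the /_cluster/health endpoint and its "status" and "number_of_nodes" fields come from Elasticsearch itself. It tries each port in turn and returns the health reported by the first node that answers, so a stopped node is skipped instead of ending the check with "Connection refused".

```python
import json
import urllib.request

# Assumption: three nodes on one host, bound to the default HTTP ports in start order.
CANDIDATE_PORTS = (9200, 9201, 9202)


def _http_get(url, timeout):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def first_healthy_node(ports=CANDIDATE_PORTS, host="localhost", timeout=2.0, fetch=None):
    """Return (port, health_dict) for the first node that answers, or None if none do.

    `fetch` is a hypothetical hook so the HTTP call can be replaced in tests.
    """
    fetch = fetch or _http_get
    for port in ports:
        url = f"http://{host}:{port}/_cluster/health"
        try:
            body = fetch(url, timeout)
        except OSError:
            # Connection refused (node stopped) or timed out: try the next port.
            continue
        return port, json.loads(body)
    return None
```

Calling first_healthy_node() while the node on 9200 is stopped should return the port of a surviving node together with its health document, for example to print health["status"] and health["number_of_nodes"]. If the nodes were started with explicit http.port settings, pass those ports instead.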

samdash · May 28, 2017, 8:53pm

Oh! Thanks a lot, it worked just as you described. Appreciate it a lot.

system · June 25, 2017, 8:53pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
