Nodes are not discovered in the cluster

Hi All,

I have two nodes in my cluster. After restarting both nodes, they can't discover each other.
At first I thought it was an issue with the 'elasticsearch-head' plugin. However, when I used Java code to query the cluster, I got the same result.

Java code:

for (int i = 0; i < 100; i++) {
    Builder builder = ImmutableSettings.settingsBuilder();
    builder.put("client.transport.sniff", true);
    Settings s = builder.build();
    TransportClient tmp = new TransportClient(s);
    Client client = tmp
            .addTransportAddress(new InetSocketTransportAddress("192.168.1.115", 9300))
            .addTransportAddress(new InetSocketTransportAddress("192.168.1.116", 9300));
    NodesInfoResponse rsp = client.admin().cluster()
            .nodesInfo(new NodesInfoRequest()).actionGet();
    String str = "Cluster:" + rsp.getClusterName() + ". Active nodes:";
    str += rsp.getNodesMap().keySet();
    System.out.println(str);
    Thread.sleep(1000);
}

Output:
Scenario 1 (the two nodes are returned alternately, but this rarely happens):

Cluster:bseproductioncluster. Active nodes:[VJurJ8rNSnCsgMu-fQPLrQ]
Cluster:bseproductioncluster. Active nodes:[m__sMnEgRFGuHCDb-H6iiQ]
Cluster:bseproductioncluster. Active nodes:[VJurJ8rNSnCsgMu-fQPLrQ]
Cluster:bseproductioncluster. Active nodes:[VJurJ8rNSnCsgMu-fQPLrQ]
....

Scenario 2 (only one node is ever returned):

Cluster:bseproductioncluster. Active nodes:[VJurJ8rNSnCsgMu-fQPLrQ]
Cluster:bseproductioncluster. Active nodes:[VJurJ8rNSnCsgMu-fQPLrQ]
Cluster:bseproductioncluster. Active nodes:[VJurJ8rNSnCsgMu-fQPLrQ]
....

OR

Cluster:bseproductioncluster. Active nodes:[m__sMnEgRFGuHCDb-H6iiQ]
Cluster:bseproductioncluster. Active nodes:[m__sMnEgRFGuHCDb-H6iiQ]
Cluster:bseproductioncluster. Active nodes:[m__sMnEgRFGuHCDb-H6iiQ]
....


OR it throws:

Exception in thread "main" org.elasticsearch.transport.NodeDisconnectedException: [115-es-11][inet[/192.168.1.115:9300]][cluster/nodes/info] disconnected

OR it throws the exception below when I comment out either of the lines '.addTransportAddress(new InetSocketTransportAddress("192.168.1.115", 9300))' or '.addTransportAddress(new InetSocketTransportAddress("192.168.1.116", 9300))':

Exception in thread "main" org.elasticsearch.common.netty.channel.ChannelException: Failed to create a selector.
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.openSelector(AbstractNioWorker.java:198)

The expected result should be something like:

Cluster:bseproductioncluster. Active nodes:[VJurJ8rNSnCsgMu-fQPLrQ, m__sMnEgRFGuHCDb-H6iiQ]

Right?

Status in head plugin:

[Image "115 node": http://elasticsearch-users.115913.n3.nabble.com/file/n4025867/115.jpg]
[Image "116 node": http://elasticsearch-users.115913.n3.nabble.com/file/n4025867/116.jpg]

This kind of cluster status is not normal, right? However, no matter how many times I restart the nodes, they can't get back to normal.

What's weird is that when I start an ES node on Windows and then start the Linux nodes 115 and 116,
all three nodes join the cluster successfully.
[Image "226 node": 226.jpg]

Rerunning the Java code then gives the result below:

Cluster:bseproductioncluster. Active nodes:[nxJ9D5g6SGSdTNyPImXgJA, 7QgigKJwSFmPkrC0unDqfg, XnAQbXWCT-uP6bd4AfKm3Q]
.....
Cluster:bseproductioncluster. Active nodes:[nxJ9D5g6SGSdTNyPImXgJA, 7QgigKJwSFmPkrC0unDqfg, XnAQbXWCT-uP6bd4AfKm3Q]

Can anyone explain how and why this happens?
My ES version: 0.19.11
ES configuration changes:
discovery.zen.ping.timeout: 5s
discovery.zen.minimum_master_nodes: 1

Thanks,
Spancer

So the cluster was correct at one point? If so, your zen discovery settings
are correct.

Can the nodes talk to each other over the network? Has your firewall
changed? Can you telnet to port 9300 from the other node?
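
If telnet is not installed on the nodes, the same reachability check can be sketched in plain Java. This is only a minimal sketch: the class name PortCheck, the helper canConnect, and the 2 second timeout are assumptions made for illustration and are not part of Elasticsearch. It only tests whether a TCP connection to the transport port can be opened, nothing more.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    // Returns true if a TCP connection to host:port can be opened within timeoutMs.
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unreachable: treat all as "cannot connect".
            return false;
        }
    }

    public static void main(String[] args) {
        // Node addresses taken from the original post; 9300 is the transport port used there.
        String[] hosts = {"192.168.1.115", "192.168.1.116"};
        for (String host : hosts) {
            System.out.println(host + ":9300 reachable = " + canConnect(host, 9300, 2000));
        }
    }
}
```

Run it on 115 and on 116. If either node reports the other as unreachable, the problem is in the network or firewall rather than in the Elasticsearch settings.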

Have you modified discovery.zen.minimum_master_nodes? If not, try changing
it to 2 to prevent two single node clusters from being created.
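
The value 2 comes from the usual majority rule for discovery.zen.minimum_master_nodes: (number of master-eligible nodes / 2) + 1, using integer division. Below is a small sketch of that arithmetic; ZenQuorum and minimumMasterNodes are hypothetical names chosen for illustration, not an Elasticsearch API.

```java
public class ZenQuorum {
    // Majority of master-eligible nodes: (n / 2) + 1 with integer division.
    public static int minimumMasterNodes(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " master-eligible node(s): minimum_master_nodes = "
                    + minimumMasterNodes(n));
        }
    }
}
```

Note that with two nodes and a value of 2, neither node can elect a master on its own, so the cluster has no master while one of the nodes is down.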

Cheers,

Ivan

On Thu, Nov 22, 2012 at 12:48 AM, spancer ray <spancer.roc.ray@gmail.com> wrote:

I have two nodes in my cluster. After restarting both nodes, they can't discover each other.
[...]


Hi Ivan,

I tried changing discovery.zen.minimum_master_nodes to 2 and restarted both nodes, but the cluster stayed red...

I checked the logs and they look normal.

Rerunning the Java code throws:
Exception in thread "main" org.elasticsearch.client.transport.NoNodeAvailableException: No node available

But after I started the Windows node, the status turned green and the cluster went back to normal...
I don't know why the status stays red when only 115 and 116 are in the cluster.

Is there any other property that I have to set along with discovery.zen.minimum_master_nodes?

Thanks,
Spancer

The only other properties that I can think of are the gateway/recovery
properties, but the defaults should work well for a 2 node cluster.

gateway.expected_nodes: 2

You did not include the thrown exception or any logs (or the Java code).
You should gist them and share on the list.

Cheers,

Ivan

On Thu, Nov 22, 2012 at 1:27 AM, spancer ray <spancer.roc.ray@gmail.com> wrote:

I tried changing discovery.zen.minimum_master_nodes to 2 and restarted both nodes, but the cluster stayed red...
[...]