Can't get Nodes to join AWS cluster

This is driving me insane. I am trying to set up a three-node AWS cluster.
Each node works fine individually, but the cluster isn't working. I am using
the latest of everything: ES 1.3.4, AWS plugin 2.3.0.

Note in the logs below, node 2 shows that it was elected as a master, yet
node 1 doesn't know about it.

yml: (identical for both nodes)

discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: [node1,node2,node3]
cloud:
  aws:
    access_key: ########
    secret_key: #################
discovery:
  type: ec2

cluster health:
{
"status": "green",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 0,
"active_shards": 0,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 0
}

Here is what I get in the logs:

[2014-09-30 20:38:06,279][INFO ][node ] [Rex 1] version[1.3.4], pid[11632], build[a70f3cc/2014-09-30T09:07:17Z]

[2014-09-30 20:38:06,279][INFO ][node ] [Rex 1] initializing ...

[2014-09-30 20:38:06,320][INFO ][plugins ] [Rex 1] loaded [mapper-attachments, cloud-aws], sites []

[2014-09-30 20:38:08,537][INFO ][node ] [Rex 1] initialized

[2014-09-30 20:38:08,537][INFO ][node ] [Rex 1] starting ...

[2014-09-30 20:38:08,628][INFO ][transport ] [Rex 1] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.164.184.135:9300]}

[2014-09-30 20:38:08,637][INFO ][discovery ] [Rex 1] rexCluster/cV8zAk42Ql6RPrAov4LRhQ

[2014-09-30 20:38:11,649][INFO ][cluster.service ] [Rex 1] new_master [Rex 1][cV8zAk42Ql6RPrAov4LRhQ][ip-10-164-184-135][inet[/10.164.184.135:9300]], reason: zen-disco-join (elected_as_master)

[2014-09-30 20:38:11,665][INFO ][http ] [Rex 1] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.164.184.135:9200]}

[2014-09-30 20:38:11,666][INFO ][node ] [Rex 1] started

[2014-09-30 20:38:11,682][INFO ][gateway ] [Rex 1] recovered [0] indices into cluster_state

[2014-09-30 20:42:44,368][INFO ][node ] [Rex 1] stopping ...

[2014-09-30 20:42:44,389][INFO ][node ] [Rex 1] stopped

[2014-09-30 20:42:44,390][INFO ][node ] [Rex 1] closing ...

[2014-09-30 20:42:44,395][INFO ][node ] [Rex 1] closed

[2014-09-30 20:42:45,833][WARN ][common.jna ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).

[2014-09-30 20:42:45,915][INFO ][node ] [Rex 1] version[1.3.4], pid[11744], build[a70f3cc/2014-09-30T09:07:17Z]

[2014-09-30 20:42:45,915][INFO ][node ] [Rex 1] initializing ...

[2014-09-30 20:42:45,953][INFO ][plugins ] [Rex 1] loaded [mapper-attachments, cloud-aws], sites []

[2014-09-30 20:42:48,165][INFO ][node ] [Rex 1] initialized

[2014-09-30 20:42:48,166][INFO ][node ] [Rex 1] starting ...

[2014-09-30 20:42:48,254][INFO ][transport ] [Rex 1] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.164.184.135:9300]}

[2014-09-30 20:42:48,257][INFO ][discovery ] [Rex 1] rexCluster/3A7q_M1AT4GvVJkhfwzobQ

[2014-09-30 20:43:18,260][WARN ][discovery ] [Rex 1] waited for 30s and no initial state was set by the discovery

[2014-09-30 20:43:18,266][INFO ][http ] [Rex 1] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.164.184.135:9200]}

[2014-09-30 20:43:18,266][INFO ][node ] [Rex 1] started

[2014-09-30 20:44:03,464][DEBUG][action.admin.cluster.state] [Rex 1] no known master node, scheduling a retry

[2014-09-30 20:44:03,464][DEBUG][action.admin.indices.mapping.get] [Rex 1] no known master node, scheduling a retry

[2014-09-30 20:39:33,258][INFO ][node ] [Rex 2] version[1.3.4], pid[7486], build[a70f3cc/2014-09-30T09:07:17Z]

[2014-09-30 20:39:33,258][INFO ][node ] [Rex 2] initializing ...

[2014-09-30 20:39:33,302][INFO ][plugins ] [Rex 2] loaded [mapper-attachments, cloud-aws], sites []

[2014-09-30 20:39:35,451][INFO ][node ] [Rex 2] initialized

[2014-09-30 20:39:35,451][INFO ][node ] [Rex 2] starting ...

[2014-09-30 20:39:35,540][INFO ][transport ] [Rex 2] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.97.175.169:9300]}

[2014-09-30 20:39:35,555][INFO ][discovery ] [Rex 2] rexCluster/jkjH8iNhQciZIRwpIezvAw

[2014-09-30 20:39:38,566][INFO ][cluster.service ] [Rex 2] new_master [Rex 2][jkjH8iNhQciZIRwpIezvAw][ip-10-97-175-169][inet[/10.97.175.169:9300]], reason: zen-disco-join (elected_as_master)

[2014-09-30 20:39:38,583][INFO ][http ] [Rex 2] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.97.175.169:9200]}

[2014-09-30 20:39:38,583][INFO ][node ] [Rex 2] started

[2014-09-30 20:39:38,602][INFO ][gateway ] [Rex 2] recovered [0] indices into cluster_state

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/de302bc2-6a4b-4b6d-9a9b-d6fd51466379%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Try setting discovery: trace in your logging.yml file and upload your logs to a Gist.
Just replace the key/secret in the logs with XXXXXX, as you probably don't want to share them :)
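
For reference, a minimal sketch of what that logging.yml change might look like on ES 1.x. It assumes the stock config/logging.yml layout, where per-logger levels sit under a top-level logger: key; if your file already has a logger: section, add the line there rather than creating a second section.

```yaml
# config/logging.yml (sketch, assuming the default ES 1.x layout)
logger:
  # zen discovery: pings, master election, join requests.
  # Child loggers under "discovery" inherit this level.
  discovery: TRACE
```

Restarting each node after the edit is the simplest way to be sure the new level is picked up. Remember to set it back afterwards, since TRACE output is verbose.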

Sent from my iPad

On 30 Sep 2014, at 22:57, sammy sabdalla80@gmail.com wrote:

This is driving me insane. I am trying to set up a three-node AWS cluster. Each node works fine individually, but the cluster isn't working. I am using the latest of everything: ES 1.3.4, AWS plugin 2.3.0.


I am still playing with it; I will upload the logs if I hit a dead end. But here is
some strange behavior I noticed, maybe you can shed some light.
We switched the unicast IPs to private IPs as opposed to public IPs.

On node 1: if I use [IP1,IP2,IP3], then node1 & node2 seem to be in a cluster
and node3 isn't.
On node 1: if I use [IP1,IP3,IP2], then node1 & node3 seem to be in a cluster
and node2 isn't.

Does this mean anything?
Also, when I query the cluster state, the 2 nodes that joined report cluster
version "3", while the non-joining node shows "2".
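
To take list ordering out of the picture, one thing to try is giving every node the same explicit host:port list. This is a minimal sketch, not a confirmed fix: the 10.0.0.x addresses are placeholders for the three private IPs, 9300 is the transport port shown in the logs, and rexCluster is the cluster name from the discovery log lines. It leaves out discovery.type: ec2 on purpose, purely as a way to test the static unicast list by itself.

```yaml
# elasticsearch.yml, identical on all three nodes (sketch)
cluster.name: rexCluster
# quorum of master-eligible nodes for a three-node cluster: (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
# placeholder private IPs; replace with the real addresses of node1..node3
discovery.zen.ping.unicast.hosts: ["10.0.0.1:9300", "10.0.0.2:9300", "10.0.0.3:9300"]
```

If the nodes still pair up differently depending on the order, checking that port 9300 is reachable between every pair of instances would be a reasonable next step.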

On Wednesday, October 1, 2014 2:20:15 AM UTC-4, David Pilato wrote:

Try setting discovery: trace in your logging.yml file and upload your
logs to a Gist.
Just replace the key/secret in the logs with XXXXXX, as you probably don't
want to share them :)
