EC2 Discovery Not Working

So, this one is really puzzling me. I'm sure I've got something
misconfigured, but I can't figure it out.

I have two ES clusters on EC2 running 0.19.11 that work perfectly. I am
trying to set up a new cluster of two nodes running 0.20.6 on EC2. The two
nodes won't talk to each other unless I use unicast and set the host names
directly in the config. EC2 discovery finds all of the other possible nodes
correctly, but fails to connect to the other nodes on port 9300:

[2013-03-29 11:04:16,642][WARN ][discovery.zen.ping.unicast] [es3.logstash.ec2.example.com] failed to send ping to [[#cloud-i-12345678-0][inet[/a.b.c.d:9300]]]
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[/a.b.c.d:9300]][discovery/zen/unicast] request_id [4] timed out after [3750ms]

As far as I can tell, the problem is that the nodes are not binding to port
9300. Instead they're binding to 9302:

[2013-03-29 11:04:09,250][INFO ][transport ] [es3.logstash.ec2.example.com] bound_address {inet[/0.0.0.0:9302]}, publish_address {inet[/aa.bb.cc.dd:9302]}

I have verified that the security group allows traffic on the port range
9200-9400. If I try to telnet from one of the two nodes to the other on
port 9300, telnet hangs. If I try to telnet to port 9302, it works. I also
tried disabling EC2 discovery and explicitly listing the nodes using port
9302 (instead of the default 9300) and that worked great:

discovery.zen.ping.unicast.hosts: ["es3.logstash.ec2.example.com:9302","es4.logstash.ec2.example.com:9302"]
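
The telnet test described above can also be scripted, which makes it easy to repeat from each node. The following is only a minimal sketch using the Python standard library; `can_connect` is a hypothetical helper name, not part of Elasticsearch or any EC2 tooling.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, calling `can_connect("es4.logstash.ec2.example.com", 9300)` and then the same with 9302 from es3 would reproduce the two telnet checks.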

If I try explicitly forcing the transport port to 9300 and the http port to
9200, Elasticsearch complains that the ports are already in use and shuts
back down.

My discovery configuration looks like the following:

discovery.type: ec2
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
cloud.node.auto_attributes: true
cloud.aws.region: us-east-1
discovery.ec2.groups: ElasticSearchLogstash

Anything obvious that I'm missing?

-Sean

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Any suggestions?

Hi VH,
stating the obvious maybe, but what else is listening on tcp/9300?
( netstat -anp | grep LISTEN | grep 9300 )

From the discovery.ec2.groups value you show, it seems you are using
Logstash. You know that in its default config Logstash starts its own ES
cluster, which will compete for ports with your separate ES cluster. I also
find it annoying but haven't found a simple way to turn it off. I solved it
in our Logstash cluster by setting Logstash to start after ES, and it seems
OK with that.
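
To illustrate the port competition described above: assuming the transport port is configured as a range (the Elasticsearch default is believed to be 9300-9400) and a node binds the first free port in that range, a node would land on a higher port such as 9302 whenever other processes already hold the lower ones. The Python sketch below mimics that selection with plain sockets. `first_free_port` is a hypothetical helper for illustration, not Elasticsearch code.

```python
import socket


def first_free_port(start, end, host="127.0.0.1"):
    """Return the first port in [start, end] that can be bound on host, or None."""
    for port in range(start, end + 1):
        probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            probe.bind((host, port))
        except OSError:
            # Port is taken (or not permitted); try the next one in the range.
            continue
        else:
            return port
        finally:
            # Always release the probe socket so it does not hold the port.
            probe.close()
    return None
```

On a host where other processes already listen on 9300 and 9301, `first_free_port(9300, 9400, host="0.0.0.0")` would be expected to skip both and return the next port that is free.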

let us know how it goes...
Beto

Gah! You're right! I'm an idiot!

We already have one Logstash cluster consisting of two indexers and four ES
servers running an older release of Logstash and ES 0.19.11. I was trying
to set up a new Logstash cluster with the latest Logstash and ES 0.20.6.
Even though I run the Logstash indexers on separate boxes from
Elasticsearch, I still run the Logstash Agent on the ES boxes. Everything
is automated with Chef and of course I changed the Chef recipe as well. As
a result, the Logstash Agent config on the ES node was misconfigured with 3
very very important lines:

elasticsearch {
  embedded => true
}

Running ES embedded within Logstash means ES doesn't show up in the process
list. But looking at the open ports... whoops.

$ netstat -anp | grep LISTEN | grep 9300
tcp 0 0 0.0.0.0:9300 0.0.0.0:* LISTEN 4811/java
$ ps ax | grep 4811
4811 ? Ssl 9:42 /usr/bin/java -Xms256M -Xmx256M -jar logstash-monolithic.jar agent --config /etc/logstash

Absolutely awesome find. Thank you!

-Sean

Glad you solved it :)
B
