Unicast discovery fails to connect to master

Robin_Verlangen · February 14, 2014, 7:30am

Hi there,

We're having issues with a cluster that repeatedly fails to connect to its
master. Please see the logs below:

INFO: [b002.my-cluster.com] failed to send join request to master [[b005.my-cluster.com][Hpm2Z7AaR3ugg417majMQg][inet[/37.139.25.xxx:9302]]], reason [org.elasticsearch.ElasticSearchTimeoutException: Timeout waiting for task.]
Feb 14, 2014 7:23:47 AM org.elasticsearch.discovery.zen.ping.unicast
WARNING: [b002.my-cluster.com] failed to send ping to [[#zen_unicast_1#][inet[b001.my-cluster.com/85.17.231.xxx:9300]]]
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[b001.my-cluster.com/85.17.231.xxx:9300]][discovery/zen/unicast] request_id [18] timed out after [3750ms]
    at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:356)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

Feb 14, 2014 7:23:49 AM org.elasticsearch.discovery.zen
INFO: [b002.my-cluster.com] master_left [[b005.my-cluster.com][Hpm2Z7AaR3ugg417majMQg][inet[/37.139.25.xxx:9302]]], reason [do not exists on master, act as master failure]
Feb 14, 2014 7:23:49 AM org.elasticsearch.discovery
INFO: [b002.my-cluster.com] my-cluster-001/Vjs0tUn7QTq8oDN2F0PxQQ
Feb 14, 2014 7:23:49 AM org.elasticsearch.http
INFO: [b002.my-cluster.com] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/37.139.5.xxx:9200]}
Feb 14, 2014 7:23:49 AM org.elasticsearch.node
INFO: [b002.my-cluster.com] started

Best regards,

Robin Verlangen
Chief Data Architect

W http://www.robinverlangen.nl
E robin@us2.nl

http://goo.gl/Lt7BC
What is CloudPelican? http://goo.gl/HkB3D

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CADVHTB9k9gFXtv5qbxuAnoZ%3Di68Vk9RjTmAfyvUve899nddfTg%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.

Robin_Verlangen · February 14, 2014, 7:36am

In addition to my previous question, is it correct that version 0.90.11
only pings one port, instead of the entire 9300-9400 range?

https://github.com/elasticsearch/elasticsearch/blob/a4b2366e1e50953b7308b21963133dc50dd3fc60/src/main/java/org/elasticsearch/discovery/zen/ping/unicast/UnicastZenPing.java#L112


      
              List<String> hosts = Lists.newArrayList(hostArr);
              logger.debug("using initial hosts {}, with concurrent_connects [{}]", hosts, concurrentConnects);
          
              List<DiscoveryNode> nodes = Lists.newArrayList();
              int idCounter = 0;
              for (String host : hosts) {
                  try {
                      TransportAddress[] addresses = transportService.addressesFromString(host);
                      // we only limit to 1 addresses, makes no sense to ping 100 ports
                      for (int i = 0; (i < addresses.length && i < LIMIT_PORTS_COUNT); i++) {
                          nodes.add(new DiscoveryNode("#zen_unicast_" + (++idCounter) + "#", addresses[i], version));
                      }
                  } catch (Exception e) {
                      throw new ElasticsearchIllegalArgumentException("Failed to resolve address for [" + host + "]", e);
                  }
              }
              this.nodes = nodes.toArray(new DiscoveryNode[nodes.size()]);
          
              transportService.registerHandler(UnicastPingRequestHandler.ACTION, new UnicastPingRequestHandler());
          }
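
To make the effect of that loop bound concrete, here is a minimal, self-contained sketch. It is not the Elasticsearch implementation: `UnicastHostSketch`, `expand` and `pingTargets` are hypothetical names, the parsing of `host`, `host:port` and `host[first-last]` is a simplified assumption about what `addressesFromString` returns, and the limit is passed as a parameter because the value of `LIMIT_PORTS_COUNT` is not shown in the excerpt (its comment suggests 1). The 9300-9400 default is the range mentioned in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the host expansion step; not the Elasticsearch code. */
final class UnicastHostSketch {

    /** Expands "host", "host:port" or "host[first-last]" into "host:port" candidates. */
    static List<String> expand(String entry, int defaultFirst, int defaultLast) {
        String host = entry;
        int first = defaultFirst;
        int last = defaultLast;
        int bracket = entry.indexOf('[');
        int colon = entry.lastIndexOf(':');
        if (bracket != -1) {
            // "host[9300-9312]" form: an explicit port range (IPv6 literals are not handled here)
            host = entry.substring(0, bracket);
            String[] range = entry.substring(bracket + 1, entry.indexOf(']')).split("-");
            first = Integer.parseInt(range[0].trim());
            last = range.length > 1 ? Integer.parseInt(range[1].trim()) : first;
        } else if (colon != -1) {
            // "host:9302" form: exactly one port
            host = entry.substring(0, colon);
            first = Integer.parseInt(entry.substring(colon + 1).trim());
            last = first;
        }
        List<String> candidates = new ArrayList<>();
        for (int port = first; port <= last; port++) {
            candidates.add(host + ":" + port);
        }
        return candidates;
    }

    /** Keeps at most `limit` candidates per entry, mirroring the loop bound in the excerpt. */
    static List<String> pingTargets(List<String> entries, int limit) {
        List<String> targets = new ArrayList<>();
        for (String entry : entries) {
            List<String> candidates = expand(entry, 9300, 9400);
            for (int i = 0; i < candidates.size() && i < limit; i++) {
                targets.add(candidates.get(i));
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("b001.my-cluster.com", "b002.my-cluster.com[9300-9312]");
        System.out.println(pingTargets(hosts, 1)); // one target per entry
        System.out.println(pingTargets(hosts, 3)); // up to three targets per entry
    }
}
```

Under these assumptions, a limit of 1 leaves only the first port of each entry (`:9300` for both hosts here), even when the entry names a whole range.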

dadoonet · February 14, 2014, 7:39am

Is the master node under pressure? I mean memory pressure (JVM old-generation GC).

--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

dadoonet · February 14, 2014, 7:46am

The ping goes to a single port.
If you did not set a port in the unicast list, I guess 9300 is assumed.

Modify the elasticsearch.yml file and set the "right" port for this node.

HTH

Robin_Verlangen · February 14, 2014, 7:51am

Hi David,

OK, I think the latter (the port range) is the actual issue. Could you help
me think through a possible solution?

We run 3 ES applications per node, each with 2-4 ES clients. Would it then
be best to specify
b001.my-cluster.com[9300-9312],b002.my-cluster.com[9300-9312]
and so on?

dadoonet · February 14, 2014, 8:07am

I'm not sure about your architecture. Maybe you have good reasons for it, but running more than one node per machine is not what I'd recommend.
But maybe these are "client only nodes"?

Specifying a port range is OK. When your node starts, it tries to ping b001:9300, then b001:9301, …
At least one node should answer; otherwise the current node will think it is alone and will set itself as master.

Once the current node has received the cluster state from the master node, it knows exactly which nodes form the cluster and on which port each node listens.
So, when pinging, only the right port is pinged. If the expected node does not answer the ping request, it is considered to have left the cluster.
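
The startup behaviour described above can be pictured with a small toy model. This is only a sketch under loud assumptions: `StartupPingSketch`, `firstResponder` and `decide` are hypothetical names, the ping is replaced by a caller-supplied predicate, and everything else real discovery does (timeouts, repeated ping rounds, master election among several candidates) is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Toy model of the startup ping described above; not the Elasticsearch code. */
final class StartupPingSketch {

    /** Returns the first candidate address that answers, trying them in list order. */
    static Optional<String> firstResponder(List<String> candidates, Predicate<String> answersPing) {
        for (String candidate : candidates) {
            if (answersPing.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    /** Summarises what the starting node would do in this simplified model. */
    static String decide(List<String> candidates, Predicate<String> answersPing) {
        return firstResponder(candidates, answersPing)
                .map(node -> "join cluster via " + node)
                .orElse("no answer: elect self as master");
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("b001:9300", "b001:9301", "b001:9302");
        // Hypothetical case: only the process listening on 9302 answers the ping.
        System.out.println(decide(candidates, "b001:9302"::equals));
        // Hypothetical case: nothing answers, so the node believes it is alone.
        System.out.println(decide(candidates, address -> false));
    }
}
```

The second case is the one to avoid: if none of the listed ports answers, the node in this model ends up as its own master.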

Makes sense?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr

Robin_Verlangen · February 14, 2014, 8:10am

Hi David,

Yes, that makes sense. Thank you for helping me out.

Some background on the architecture: there is one ES server and two other
applications that are client-only. However, these services start in random
order, so it is uncertain which service picks up port 9300.

Best regards,

Robin Verlangen
Chief Data Architect

W http://www.robinverlangen.nl
E robin@us2.nl

http://goo.gl/Lt7BC
What is CloudPelican? http://goo.gl/HkB3D

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.

On Fri, Feb 14, 2014 at 9:07 AM, David Pilato david@pilato.fr wrote:

I'm not sure about your architecture. Maybe you have good reasons for it,
but running more than one node per machine is not what I'd recommend.
But maybe here they are "client-only nodes"?

Specifying a port range is OK. When your node starts, it tries to ping
b001:9300, then b001:9301, ...
At least one node should answer; otherwise the current node will think it
is alone and will elect itself master.

Once the current node has received the cluster state from the master node,
it knows exactly which nodes form the cluster and which port each node
uses.
So from then on, only the right port is pinged. If the expected node does
not answer the ping request, it is considered to have left the cluster.

Makes sense?
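
To make the two phases described above concrete, here is a minimal Python sketch. It is not Elasticsearch code: the function names, the `ping` callable and the sample addresses are all assumptions made for illustration, and the real zen discovery does much more (timeouts, master election, retries).

```python
def initial_discovery(candidates, ping):
    """Phase 1: ping every configured candidate address.

    Returns the addresses that answered. An empty list means the node
    believes it is alone and would act as master itself.
    """
    return [addr for addr in candidates if ping(addr)]


def nodes_that_left(cluster_state, ping):
    """Phase 2: ping only the exact address recorded for each known node.

    cluster_state maps a node name to its (host, port) address.
    Returns the names of nodes that did not answer.
    """
    return [name for name, addr in cluster_state.items() if not ping(addr)]


# Hypothetical example: only b001:9301 is actually listening.
listening = {("b001", 9301)}


def fake_ping(addr):
    return addr in listening


candidates = [("b001", port) for port in (9300, 9301, 9302)]
answered = initial_discovery(candidates, fake_ping)
acts_as_master = not answered  # False here, because b001:9301 answered
```

If the candidate list only contained b001:9300, nothing would answer and the node would conclude it is alone, which is the situation a too-narrow unicast list can produce.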

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr

On February 14, 2014 at 08:51:47, Robin Verlangen (robin@us2.nl) wrote:

Hi David,

OK, I think the latter (the port range) is the actual issue. Could you
help me think through a possible solution?

We run 3 ES applications per node, each with 2-4 ES clients. Would it then
be best to specify b001.my-cluster.com[9300-9312],b002.my-cluster.com[9300-9312]
and so on?
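
For reference, this is roughly what such a host list expands to. The Python sketch below is only an illustration of the `host[start-end]` notation used above; the helper name and the default of 9300 for entries without a port are assumptions, not taken from the Elasticsearch source.

```python
import re

# Assumed default when an entry gives no port at all.
DEFAULT_PORT = 9300

# Accepts "host", "host:port" or "host[start-end]".
_ENTRY = re.compile(
    r"^(?P<host>[^\[\]:]+)(?:\[(?P<lo>\d+)-(?P<hi>\d+)\]|:(?P<port>\d+))?$"
)


def expand_unicast_hosts(setting):
    """Expand a comma-separated host list into (host, port) candidates."""
    candidates = []
    for raw in setting.split(","):
        entry = raw.strip()
        if not entry:
            continue
        match = _ENTRY.match(entry)
        if match is None:
            raise ValueError(f"unrecognised host entry: {entry!r}")
        host = match.group("host")
        if match.group("lo") is not None:
            lo, hi = int(match.group("lo")), int(match.group("hi"))
            if lo > hi:
                raise ValueError(f"empty port range in entry: {entry!r}")
            ports = range(lo, hi + 1)
        elif match.group("port") is not None:
            ports = [int(match.group("port"))]
        else:
            ports = [DEFAULT_PORT]
        candidates.extend((host, port) for port in ports)
    return candidates


hosts = expand_unicast_hosts(
    "b001.my-cluster.com[9300-9312],b002.my-cluster.com[9300-9312]"
)
# 13 ports per machine, so 26 candidate addresses for these two entries.
```

Every candidate address is one more ping at startup, so a range of 13 ports per machine is cheap compared with missing the one port a node actually bound to.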

Best regards,

Robin Verlangen

On Fri, Feb 14, 2014 at 8:46 AM, David Pilato david@pilato.fr wrote:

A ping targets a single port.
If you did not set a port in the unicast list, I guess 9300 is assumed.

Modify the elasticsearch.yml file and set the "right" port for this node.
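
One way to find out which port is the "right" one is to check which ports accept TCP connections at all. The following is a small Python sketch using only the standard library; the function name and timeout are arbitrary choices, and a successful connection only shows that something is listening there, not that it is an Elasticsearch node of the expected cluster.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling it for each port from 9300 upwards on a given machine gives a quick picture of which transport ports are in use there.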

HTH

--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On Feb 14, 2014, at 08:36, Robin Verlangen robin@us2.nl wrote:

In addition to my previous question, is it correct that version
0.90.11 only pings one port, instead of the entire 9300-9400 range?

https://github.com/elasticsearch/elasticsearch/blob/a4b2366e1e50953b7308b21963133dc50dd3fc60/src/main/java/org/elasticsearch/discovery/zen/ping/unicast/UnicastZenPing.java#L112

Best regards,

Robin Verlangen


