Unicast discovery fails to connect to master


(Robin Verlangen) #1

Hi there,

We're having issues with a cluster that repeatedly fails to connect to its
master. Please see the logs below:

INFO: [b002.my-cluster.com] failed to send join request to master [[b005.my-cluster.com][Hpm2Z7AaR3ugg417majMQg][inet[/37.139.25.xxx:9302]]], reason [org.elasticsearch.ElasticSearchTimeoutException: Timeout waiting for task.]
Feb 14, 2014 7:23:47 AM org.elasticsearch.discovery.zen.ping.unicast
WARNING: [b002.my-cluster.com] failed to send ping to [[#zen_unicast_1#][inet[b001.my-cluster.com/85.17.231.xxx:9300]]]
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[b001.my-cluster.com/85.17.231.xxx:9300]][discovery/zen/unicast] request_id [18] timed out after [3750ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:356)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

Feb 14, 2014 7:23:49 AM org.elasticsearch.discovery.zen
INFO: [b002.my-cluster.com] master_left [[b005.my-cluster.com][Hpm2Z7AaR3ugg417majMQg][inet[/37.139.25.xxx:9302]]], reason [do not exists on master, act as master failure]
Feb 14, 2014 7:23:49 AM org.elasticsearch.discovery
INFO: [b002.my-cluster.com] my-cluster-001/Vjs0tUn7QTq8oDN2F0PxQQ
Feb 14, 2014 7:23:49 AM org.elasticsearch.http
INFO: [b002.my-cluster.com] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/37.139.5.xxx:9200]}
Feb 14, 2014 7:23:49 AM org.elasticsearch.node
INFO: [b002.my-cluster.com] started
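
When a log like this gets long, it can help to pull out only the transport requests that timed out, together with the address they were sent to. The following is a minimal Python sketch and not part of Elasticsearch: the function name `timed_out_targets` and the regular expression are assumptions based solely on the message format in the excerpt above, and the pattern expects each exception message to sit on a single, unwrapped line.

```python
import re

# Matches messages of the form:
#   ...ReceiveTimeoutTransportException: [][inet[host/ip:port]][action] request_id [N] timed out after [Mms]
# The pattern is an assumption derived from this log excerpt, not an official log grammar.
TIMEOUT_RE = re.compile(
    r"ReceiveTimeoutTransportException: \[[^\]]*\]\[inet\[(?P<addr>[^\]]+)\]\]"
    r"\[(?P<action>[^\]]+)\] request_id \[(?P<req>\d+)\] "
    r"timed out after \[(?P<ms>\d+)ms\]"
)


def timed_out_targets(log_text):
    """Return (address, action, timeout_ms) for each timed-out request in the log."""
    return [
        (m.group("addr"), m.group("action"), int(m.group("ms")))
        for m in TIMEOUT_RE.finditer(log_text)
    ]


# Sample line copied from the log in this post (joined onto one line).
sample = (
    "org.elasticsearch.transport.ReceiveTimeoutTransportException: "
    "[][inet[b001.my-cluster.com/85.17.231.xxx:9300]][discovery/zen/unicast] "
    "request_id [18] timed out after [3750ms]"
)
print(timed_out_targets(sample))
```

For the sample line this reports the ping to b001.my-cluster.com on port 9300 with a 3750 ms timeout, which is the address worth checking first.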

Best regards,

Robin Verlangen
Chief Data Architect

W http://www.robinverlangen.nl
E robin@us2.nl

http://goo.gl/Lt7BC
What is CloudPelican? http://goo.gl/HkB3D

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CADVHTB9k9gFXtv5qbxuAnoZ%3Di68Vk9RjTmAfyvUve899nddfTg%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Robin Verlangen) #2

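In addition to my previous question, is it correct that version 0.90.11
only pings one port, instead of the entire 9300-9400 range?

https://github.com/elasticsearch/elasticsearch/blob/a4b2366e1e50953b7308b21963133dc50dd3fc60/src/main/java/org/elasticsearch/discovery/zen/ping/unicast/UnicastZenPing.java#L112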


(David Pilato) #3

Is the master node under pressure? I mean memory pressure (JVM old-generation GC).

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


(David Pilato) #4

A ping only targets one port.
If you did not set a port in the unicast list, 9300 is assumed, I guess.

Modify the elasticsearch.yml file and set the "right" port for this node.
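
To find out which port is the "right" one, a quick check is to see which candidate ports on the target host accept a TCP connection at all. This is a minimal Python sketch under stated assumptions: `port_is_open` and `open_ports` are hypothetical helper names, and an open TCP port only shows that something is listening there, not that it is an Elasticsearch transport port.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def open_ports(host, ports, timeout=3.0):
    """Return the subset of `ports` on `host` that accept a TCP connection."""
    return [p for p in ports if port_is_open(host, p, timeout)]


# Example call (not executed here because it needs network access):
#   open_ports("b001.my-cluster.com", range(9300, 9313))
```

Any port this reports still has to be confirmed against the node's own startup log before it goes into the unicast list.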

HTH

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


(Robin Verlangen) #5

Hi David,

OK, I think the latter (the port range) is the actual issue. Would you be
able to help me think through a possible solution?

We run 3 ES applications per node, each with 2-4 ES clients. Would it then
be best to specify
b001.my-cluster.com[9300-9312],b002.my-cluster.com[9300-9312]
et cetera?


(David Pilato) #6

Not sure about your architecture. Maybe you have good reasons for it, but running more than one node per machine is not what I'd recommend.
But maybe in your case they are "client-only nodes"?

Specifying a port range is OK. When your node starts, it tries to ping b001:9300, then b001:9301, …
At least one node should answer; otherwise, the current node will think it's alone and will elect itself as master.

Once the current node has received the cluster state from the master node, it knows exactly which nodes form the cluster and which port each node uses.
So, when pinging, only the right port is pinged. If the expected node does not answer the ping request, it is considered to have left the cluster.

Does that make sense?
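
To make the startup behaviour concrete, here is a rough Python sketch of how a host list with port ranges expands into the ordered addresses that get pinged. It is an illustration only, not Elasticsearch's own parser: the function name `expand_unicast_hosts`, the accepted grammar (`host`, `host:port`, `host[first-last]`) and the 9300 default are assumptions modelled on the examples in this thread.

```python
import re

# host, optionally followed by ":port" or "[first-last]". This grammar is an
# assumption based on the examples in this thread, not Elasticsearch's parser.
ENTRY_RE = re.compile(
    r"^(?P<host>[^\[\]:,\s]+)"
    r"(?::(?P<port>\d+)|\[(?P<first>\d+)-(?P<last>\d+)\])?$"
)

DEFAULT_PORT = 9300  # the port assumed in this thread when none is given


def expand_unicast_hosts(spec):
    """Expand 'a[9300-9302],b:9305,c' into an ordered list of (host, port)."""
    targets = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        m = ENTRY_RE.match(entry)
        if m is None:
            raise ValueError(f"unrecognised entry: {entry!r}")
        if m.group("first") is not None:
            first, last = int(m.group("first")), int(m.group("last"))
            if first > last:
                raise ValueError(f"empty port range in {entry!r}")
            ports = range(first, last + 1)
        elif m.group("port") is not None:
            ports = [int(m.group("port"))]
        else:
            ports = [DEFAULT_PORT]
        targets.extend((m.group("host"), p) for p in ports)
    return targets


targets = expand_unicast_hosts(
    "b001.my-cluster.com[9300-9312],b002.my-cluster.com[9300-9312]"
)
print(len(targets))  # number of candidate addresses
print(targets[:2])   # b001:9300 first, then b001:9301
```

With the two ranges from this thread the list holds 26 candidate addresses, so a starting node has many chances to reach at least one running node before it concludes it is alone.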

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


(Robin Verlangen) #7

Hi David,

Yes, that makes sense. Thank you for helping me out.

Some background on the architecture: there is one ES server and two other
applications that are client-only. However, these services start in random
order (i.e. it is uncertain which service picks up port 9300).
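
The effect of start order can be sketched in a few lines of Python, assuming each process simply takes the first free port in its configured range, which is what the behaviour described in this thread implies. The helper name `bind_first_free` is hypothetical and the code only imitates the port selection; it does not talk to Elasticsearch.

```python
import socket


def bind_first_free(first_port, last_port, host="127.0.0.1"):
    """Bind a listening socket to the lowest free port in [first_port, last_port]."""
    for port in range(first_port, last_port + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            sock.close()  # port already taken, try the next one
            continue
        sock.listen(1)
        return sock, port
    raise RuntimeError("no free port in range")


# Example (not executed here):
#   first, p1 = bind_first_free(9300, 9400)   # whoever starts first gets the lowest free port
#   second, p2 = bind_first_free(9300, 9400)  # the next process gets a higher one
```

Whichever service calls this first ends up on the lowest port, so a unicast list that names only port 9300 may point at a client rather than the server.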

Best regards,

Robin Verlangen
Chief Data Architect

W http://www.robinverlangen.nl
E robin@us2.nl

http://goo.gl/Lt7BC
What is CloudPelican? http://goo.gl/HkB3D

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.

On Fri, Feb 14, 2014 at 9:07 AM, David Pilato david@pilato.fr wrote:

Not sure about your architecture. May be you have good reasons for that
but running more than one node per machine is not what I'd recommend.
But here, may be they are "client only nodes"?

Specifying port range is OK. When your node starts, it tries to ping
b001:9300, then b001:9301, ...
One node should at least answer otherwise, current node will think it's
alone and will set itself as master.

Once the cluster state is get from master node to the current node,
current node knows exactly which nodes forms the cluster and on which port
for each node.
So, when pinging, only the right port is pinged. If the expected node does
not answer to ping request, it will be considered as leaving the cluster.

Makes sense?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfrhttps://twitter.com/elasticsearchfr

Le 14 février 2014 à 08:51:47, Robin Verlangen (robin@us2.nl) a écrit:

Hi David,

Ok, I think the latter (the port range) is the actual issue. Would you be
able to think with me on a possible solution?

We run 3 ES applications per node, each with 2-4 ES clients. Would it then
be best to specify b001.my-cluster.com[9300-9312],b002.my-cluster.com[9300-9312]
et cetera?

Best regards,

Robin Verlangen
Chief Data Architect

W http://www.robinverlangen.nl
E robin@us2.nl

http://goo.gl/Lt7BC
What is CloudPelican? http://goo.gl/HkB3D

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.

On Fri, Feb 14, 2014 at 8:46 AM, David Pilato david@pilato.fr wrote:

Ping will ping one port.
If you did not set port in unicast list, 9300 is assumed I guess.

Modify elasticsearch.yml file and set the "right" port for this node.

HTH

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 14 févr. 2014 à 08:36, Robin Verlangen robin@us2.nl a écrit :

In addition to my previous question, is it correct that version
0.90.11 only pings one port, instead of the entire 9300-9400 range?

https://github.com/elasticsearch/elasticsearch/blob/a4b2366e1e50953b7308b21963133dc50dd3fc60/src/main/java/org/elasticsearch/discovery/zen/ping/unicast/UnicastZenPing.java#L112

Best regards,

Robin Verlangen
Chief Data Architect

W http://www.robinverlangen.nl
E robin@us2.nl

http://goo.gl/Lt7BC
What is CloudPelican? http://goo.gl/HkB3D

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.


--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CADVHTB9dN0Po52FZSuFi-1zuustneKNr0%2BJiVKpgXJjx%3D6k74g%40mail.gmail.com.

For more options, visit https://groups.google.com/groups/opt_out.



(system) #8