Master node disconnects from cluster


(Boyd Meier) #1

My cluster periodically gets in a weird state where one of the nodes
(usually the master) gets disconnected from the cluster, which causes the
cluster to be partitioned. The remaining nodes elect a new master, and
sometimes recover without a hiccup, but the original master still thinks it
is the master and is still connected to all the remaining nodes.
I have a script pinging all the nodes with es health every minute, and the
trace of the statuses is interesting:

2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node16 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node20 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node19 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node17 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node13 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node21 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node12 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node14

<only five nodes respond at 03:02, but they all report green with 8 nodes>
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node17
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node19
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node12
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node14

<something happens, and two nodes are reporting 7 nodes in the cluster>
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node16
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node13
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node21
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node12
2013-09-04 00:04:08 ESClusterName green 8 8 695 1390 0 0 0 node14
2013-09-04 00:04:08 ESClusterName green 7 7 695 1390 0 0 0 node19
2013-09-04 00:04:10 ESClusterName yellow 7 7 695 1220 0 14 156 node17

<now we are in the partitioned state, and node20 is off in neverland>
2013-09-04 00:05:03 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node16
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node19
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node13
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node17
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node14
2013-09-04 00:05:04 ESClusterName yellow 7 7 695 1286 0 14 90 node21
2013-09-04 00:05:04 ESClusterName yellow 7 7 695 1286 0 14 90 node12


<and now we're in the partitioned state, ad infinitum>
2013-09-04 00:10:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node16
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node13
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node17
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node21
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node14
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node19
2013-09-04 00:10:05 ESClusterName green 7 7 695 1390 0 0 0 node12

The log file for the missing master contains many instances of errors like
the following, but the really odd thing is that the ip address mentioned in
the message is for another node in the cluster (the master was
172.31.12.20).

[2013-09-04 00:01:07,781][WARN ][discovery.zen.ping.multicast]
[node20.example.com] failed to connect to requesting node
[localhost][SprCZQI6TDuyDEhikklrUg][inet[/172.31.12.12:9301]]{client=true,
data=false}
org.elasticsearch.transport.ConnectTransportException:
[localhost][inet[/172.31.12.12:9301]] connect_timeout[30s]
at
org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:673)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:608)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:578)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:128)
at
org.elasticsearch.discovery.zen.ping.multicast.MulticastZenPing$Receiver$2.run(MulticastZenPing.java:539)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)

The other nodes all have something similar to this:
[2013-09-04 00:04:08,164][INFO ][discovery.zen ]
[node12.example.com] master_left
[[node20.example.com][hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]]],
reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2013-09-04 00:04:08,208][INFO ][cluster.service ]
[node12.example.com] master {new
[node17.example.com][6lA0f-nmR8etT6GehitTmA][inet[/172.31.12.17:9300]],
previous
[node20.example.com][hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]]},
removed
{[node20.example.com][hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]],},
reason: zen-disco-receive(from master
[[node17.example.com][6lA0f-nmR8etT6GehitTmA][inet[/172.31.12.17:9300]]])

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Runar Myklebust) #2

Which version are you running? In 0.20.6, I have seen some strange states
when a node disconnect and a new master is elected, but in 0.90.3 the state
is more as expected (that is, still split-brains, but the odd one node is
member of two master nodes are gone)

The ping timeout indicates network or node not responding. Are there any
heavy GC going on on the nodes?

On Wed, Sep 4, 2013 at 8:09 AM, Boyd Meier boyd.meier@gmail.com wrote:

My cluster periodically gets in a weird state where one of the nodes
(usually the master) gets disconnected from the cluster, which causes the
cluster to be partitioned. The remaining nodes elect a new master, and
sometimes recover without a hiccup, but the original master still thinks it
is the master and is still connected to all the remaining nodes.
I have a script pinging all the nodes with es health every minute, and the
trace of the statuses is interesting:

2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node16 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node20 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node19 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node17 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node13 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node21 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node12 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node14

<only five nodes respond at 03:02, but they all report green with 8 nodes>
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node17
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node19
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node12
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node14

<something happens, and two nodes are reporting 7 nodes in the cluster>
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node16
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node13
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node21
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node12
2013-09-04 00:04:08 ESClusterName green 8 8 695 1390 0 0 0 node14
2013-09-04 00:04:08 ESClusterName green 7 7 695 1390 0 0 0 node19
2013-09-04 00:04:10 ESClusterName yellow 7 7 695 1220 0 14 156 node17

<now we are in the partitioned state, and node20 is off in neverland>
2013-09-04 00:05:03 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node16
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node19
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node13
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node17
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node14
2013-09-04 00:05:04 ESClusterName yellow 7 7 695 1286 0 14 90 node21
2013-09-04 00:05:04 ESClusterName yellow 7 7 695 1286 0 14 90 node12


<and now we're in the partitioned state, ad infinitum>
2013-09-04 00:10:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node16
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node13
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node17
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node21
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node14
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node19
2013-09-04 00:10:05 ESClusterName green 7 7 695 1390 0 0 0 node12

The log file for the missing master contains many instances of errors like
the following, but the really odd thing is that the ip address mentioned in
the message is for another node in the cluster (the master was
172.31.12.20).

[2013-09-04 00:01:07,781][WARN ][discovery.zen.ping.multicast] [
node20.example.com] failed to connect to requesting node
[localhost][SprCZQI6TDuyDEhikklrUg][inet[/172.31.12.12:9301]]{client=true,
data=false}
org.elasticsearch.transport.ConnectTransportException: [localhost][inet[/
172.31.12.12:9301]] connect_timeout[30s]
at
org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:673)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:608)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:578)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:128)
at
org.elasticsearch.discovery.zen.ping.multicast.MulticastZenPing$Receiver$2.run(MulticastZenPing.java:539)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)

The other nodes all have something similar to this:
[2013-09-04 00:04:08,164][INFO ][discovery.zen ] [
node12.example.com] master_left [[node20.example.com
][hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]]], reason [failed to
ping, tried [3] times, each with maximum [30s] timeout]
[2013-09-04 00:04:08,208][INFO ][cluster.service ] [
node12.example.com] master {new [node17.example.com
][6lA0f-nmR8etT6GehitTmA][inet[/172.31.12.17:9300]], previous [
node20.example.com][hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]]},
removed {[node20.example.com
][hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]],}, reason:
zen-disco-receive(from master [[node17.example.com
][6lA0f-nmR8etT6GehitTmA][inet[/172.31.12.17:9300]]])

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
mvh

Runar Myklebust

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Boyd Meier) #3

Sorry, meant to include the version. I'm using 0.90.3. The jvm monitoring
doesn't indicate a heavy GC, though it's still possible that something like
that is happening. It sounds like you have a case where you have
configured multiple masters, how did you do that?

On Wednesday, September 4, 2013 7:56:46 AM UTC-5, Runar Myklebust wrote:

Which version are you running? In 0.20.6, I have seen some strange states
when a node disconnect and a new master is elected, but in 0.90.3 the state
is more as expected (that is, still split-brains, but the odd one node is
member of two master nodes are gone)

The ping timeout indicates network or node not responding. Are there any
heavy GC going on on the nodes?

On Wed, Sep 4, 2013 at 8:09 AM, Boyd Meier <boyd....@gmail.com<javascript:>

wrote:

My cluster periodically gets in a weird state where one of the nodes
(usually the master) gets disconnected from the cluster, which causes the
cluster to be partitioned. The remaining nodes elect a new master, and
sometimes recover without a hiccup, but the original master still thinks it
is the master and is still connected to all the remaining nodes.
I have a script pinging all the nodes with es health every minute, and
the trace of the statuses is interesting:

2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node16 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node20 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node19 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node17 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node13 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node21 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node12 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node14

<only five nodes respond at 03:02, but they all report green with 8 nodes>
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node17
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node19
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node12
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node14

<something happens, and two nodes are reporting 7 nodes in the cluster>
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node16
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node13
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node21
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node12
2013-09-04 00:04:08 ESClusterName green 8 8 695 1390 0 0 0 node14
2013-09-04 00:04:08 ESClusterName green 7 7 695 1390 0 0 0 node19
2013-09-04 00:04:10 ESClusterName yellow 7 7 695 1220 0 14 156 node17

<now we are in the partitioned state, and node20 is off in neverland>
2013-09-04 00:05:03 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node16
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node19
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node13
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node17
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node14
2013-09-04 00:05:04 ESClusterName yellow 7 7 695 1286 0 14 90 node21
2013-09-04 00:05:04 ESClusterName yellow 7 7 695 1286 0 14 90 node12


<and now we're in the partitioned state, ad infinitum>
2013-09-04 00:10:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node16
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node13
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node17
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node21
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node14
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node19
2013-09-04 00:10:05 ESClusterName green 7 7 695 1390 0 0 0 node12

The log file for the missing master contains many instances of errors
like the following, but the really odd thing is that the ip address
mentioned in the message is for another node in the cluster (the master was
172.31.12.20).

[2013-09-04 00:01:07,781][WARN ][discovery.zen.ping.multicast] [
node20.example.com] failed to connect to requesting node
[localhost][SprCZQI6TDuyDEhikklrUg][inet[/172.31.12.12:9301]]{client=true,
data=false}
org.elasticsearch.transport.ConnectTransportException: [localhost][inet[/
172.31.12.12:9301]] connect_timeout[30s]
at
org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:673)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:608)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:578)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:128)
at
org.elasticsearch.discovery.zen.ping.multicast.MulticastZenPing$Receiver$2.run(MulticastZenPing.java:539)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)

The other nodes all have something similar to this:
[2013-09-04 00:04:08,164][INFO ][discovery.zen ] [
node12.example.com] master_left [[node20.example.com][hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]]],
reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2013-09-04 00:04:08,208][INFO ][cluster.service ] [
node12.example.com] master {new [node17.example.com][6lA0f-nmR8etT6GehitTmA][inet[/172.31.12.17:9300]],
previous [node20.example.com][hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]]},
removed {[node20.example.com][hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]],},
reason: zen-disco-receive(from master [[node17.example.com
][6lA0f-nmR8etT6GehitTmA][inet[/172.31.12.17:9300]]])

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com <javascript:>.
For more options, visit https://groups.google.com/groups/opt_out.

--
mvh

Runar Myklebust

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Runar Myklebust) #4

No, I didnt, but it happens in 0.20.6 when there where nodes disconnecting
and (because of customer requirements) minimum_master_nodes is set to 1.
Then, after a disconnected node elects itself to master and is reconnected,
there are suddenly nodes connected (as it seems at least) to two masters. I
have not seen this behaviour in 0.90.3 though.

Ive also seen disconnects of nodes with no apparent reason, asked in this
group, without ever getting to understand exactly what is happening, not
event with trace-logging.

On Wed, Sep 4, 2013 at 3:35 PM, Boyd Meier boyd.meier@gmail.com wrote:

Sorry, meant to include the version. I'm using 0.90.3. The jvm
monitoring doesn't indicate a heavy GC, though it's still possible that
something like that is happening. It sounds like you have a case where you
have configured multiple masters, how did you do that?

On Wednesday, September 4, 2013 7:56:46 AM UTC-5, Runar Myklebust wrote:

Which version are you running? In 0.20.6, I have seen some strange states
when a node disconnect and a new master is elected, but in 0.90.3 the state
is more as expected (that is, still split-brains, but the odd one node is
member of two master nodes are gone)

The ping timeout indicates network or node not responding. Are there any
heavy GC going on on the nodes?

On Wed, Sep 4, 2013 at 8:09 AM, Boyd Meier boyd....@gmail.com wrote:

My cluster periodically gets in a weird state where one of the nodes
(usually the master) gets disconnected from the cluster, which causes the
cluster to be partitioned. The remaining nodes elect a new master, and
sometimes recover without a hiccup, but the original master still thinks it
is the master and is still connected to all the remaining nodes.
I have a script pinging all the nodes with es health every minute, and
the trace of the statuses is interesting:

2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node16 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node20 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node19 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node17 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node13 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node21 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node12 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node14

<only five nodes respond at 03:02, but they all report green with 8
nodes>
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node17
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node19
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node12
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node14

<something happens, and two nodes are reporting 7 nodes in the cluster>
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node16
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node13
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node21
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node12
2013-09-04 00:04:08 ESClusterName green 8 8 695 1390 0 0 0 node14
2013-09-04 00:04:08 ESClusterName green 7 7 695 1390 0 0 0 node19
2013-09-04 00:04:10 ESClusterName yellow 7 7 695 1220 0 14 156 node17

<now we are in the partitioned state, and node20 is off in neverland>
2013-09-04 00:05:03 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node16
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node19
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node13
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node17
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node14
2013-09-04 00:05:04 ESClusterName yellow 7 7 695 1286 0 14 90 node21
2013-09-04 00:05:04 ESClusterName yellow 7 7 695 1286 0 14 90 node12


<and now we're in the partitioned state, ad infinitum>
2013-09-04 00:10:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node16
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node13
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node17
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node21
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node14
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node19
2013-09-04 00:10:05 ESClusterName green 7 7 695 1390 0 0 0 node12

The log file for the missing master contains many instances of errors
like the following, but the really odd thing is that the ip address
mentioned in the message is for another node in the cluster (the master was
172.31.12.20).

[2013-09-04 00:01:07,781][WARN ][discovery.zen.ping.multicast] [
node20.example.com] failed to connect to requesting node [localhost][

SprCZQI6TDuyDEhikklrUg][inet[/**172.31.12.12:9301]]{client=**true,
data=false}
org.elasticsearch.transport.**ConnectTransportException:
[localhost][inet[/172.31.12.**12:9301 http://172.31.12.12:9301]]
connect_timeout[30s]
at org.elasticsearch.transport.netty.NettyTransport.
connectToChannels(**NettyTransport.java:673)
at org.elasticsearch.transport.netty.NettyTransport.
connectToNode(NettyTransport.**java:608)
at org.elasticsearch.transport.netty.NettyTransport.
connectToNode(NettyTransport.**java:578)
at org.elasticsearch.transport.TransportService.
connectToNode(**TransportService.java:128)
at org.elasticsearch.discovery.zen.ping.multicast.
MulticastZenPing$Receiver$2.**run(MulticastZenPing.java:539)
at java.util.concurrent.ThreadPoolExecutor.runWorker(
ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(
ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.**java:679)

The other nodes all have something similar to this:
[2013-09-04 00:04:08,164][INFO ][discovery.zen ] [
node12.example.com] master_left [[node20.example.com][**
hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]]], reason [failed to
ping, tried [3] times, each with maximum [30s] timeout]
[2013-09-04 00:04:08,208][INFO ][cluster.service ] [
node12.example.com] master {new [node17.example.com][6lA0f-

nmR8etT6GehitTmA][inet[/172.**31.12.17:9300]], previous [
node20.example.com][**hhmHf3hrSz6tjSktUN4Krg][inet[/**172.31.12.20:9300]]},
removed {[node20.example.com][hhmHf3hrSz6tjSktUN4Krg][inet[/
172.31.12.20:9300]],}, reason: zen-disco-receive(from master [[
node17.example.com][6lA0f-nmR8etT6GehitTmA][inet[/172.
31.12.17:9300]]])

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@**googlegroups.com.

For more options, visit https://groups.google.com/**groups/opt_outhttps://groups.google.com/groups/opt_out
.

--
mvh

Runar Myklebust

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
mvh

Runar Myklebust

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Boyd Meier) #5

I've seen both of those behaviors in 0.90.1 recently. That's part of why I
bumped up to 0.90.3 last week. It seems like something is still wrong with
the discovery protocol, though.

On Wednesday, September 4, 2013 8:43:11 AM UTC-5, Runar Myklebust wrote:

No, I didnt, but it happens in 0.20.6 when there where nodes disconnecting
and (because of customer requirements) minimum_master_nodes is set to 1.
Then, after a disconnected node elects itself to master and is reconnected,
there are suddenly nodes connected (as it seems at least) to two masters. I
have not seen this behaviour in 0.90.3 though.

Ive also seen disconnects of nodes with no apparent reason, asked in this
group, without ever getting to understand exactly what is happening, not
event with trace-logging.

On Wed, Sep 4, 2013 at 3:35 PM, Boyd Meier <boyd....@gmail.com<javascript:>

wrote:

Sorry, meant to include the version. I'm using 0.90.3. The jvm
monitoring doesn't indicate a heavy GC, though it's still possible that
something like that is happening. It sounds like you have a case where you
have configured multiple masters, how did you do that?

On Wednesday, September 4, 2013 7:56:46 AM UTC-5, Runar Myklebust wrote:

Which version are you running? In 0.20.6, I have seen some strange
states when a node disconnect and a new master is elected, but in 0.90.3
the state is more as expected (that is, still split-brains, but the odd one
node is member of two master nodes are gone)

The ping timeout indicates network or node not responding. Are there any
heavy GC going on on the nodes?

On Wed, Sep 4, 2013 at 8:09 AM, Boyd Meier boyd....@gmail.com wrote:

My cluster periodically gets in a weird state where one of the nodes
(usually the master) gets disconnected from the cluster, which causes the
cluster to be partitioned. The remaining nodes elect a new master, and
sometimes recover without a hiccup, but the original master still thinks it
is the master and is still connected to all the remaining nodes.
I have a script pinging all the nodes with es health every minute, and
the trace of the statuses is interesting:

2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node16 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node20 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node19 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node17 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node13 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node21 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node12 2013-09-04 00:02:02 ESClusterName green 8 8 695 1390 0 0 0 node14

<only five nodes respond at 03:02, but they all report green with 8
nodes>
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node17
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node19
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node12
2013-09-04 00:03:02 ESClusterName green 8 8 695 1390 0 0 0 node14

<something happens, and two nodes are reporting 7 nodes in the cluster>
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node16
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node13
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node21
2013-09-04 00:04:02 ESClusterName green 8 8 695 1390 0 0 0 node12
2013-09-04 00:04:08 ESClusterName green 8 8 695 1390 0 0 0 node14
2013-09-04 00:04:08 ESClusterName green 7 7 695 1390 0 0 0 node19
2013-09-04 00:04:10 ESClusterName yellow 7 7 695 1220 0 14 156 node17

<now we are in the partitioned state, and node20 is off in neverland>
2013-09-04 00:05:03 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node16
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node19
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node13
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node17
2013-09-04 00:05:03 ESClusterName yellow 7 7 695 1285 0 14 91 node14
2013-09-04 00:05:04 ESClusterName yellow 7 7 695 1286 0 14 90 node21
2013-09-04 00:05:04 ESClusterName yellow 7 7 695 1286 0 14 90 node12


<and now we're in the partitioned state, ad infinitum>
2013-09-04 00:10:02 ESClusterName green 8 8 695 1390 0 0 0 node20
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node16
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node13
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node17
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node21
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node14
2013-09-04 00:10:03 ESClusterName green 7 7 695 1390 0 0 0 node19
2013-09-04 00:10:05 ESClusterName green 7 7 695 1390 0 0 0 node12

The log file for the missing master contains many instances of errors
like the following, but the really odd thing is that the ip address
mentioned in the message is for another node in the cluster (the master was
172.31.12.20).

[2013-09-04 00:01:07,781][WARN ][discovery.zen.ping.multicast] [
node20.example.com] failed to connect to requesting node [localhost][

SprCZQI6TDuyDEhikklrUg][inet[/**172.31.12.12:9301]]{client=**true,
data=false}
org.elasticsearch.transport.**ConnectTransportException:
[localhost][inet[/172.31.12.**12:9301 http://172.31.12.12:9301]]
connect_timeout[30s]
at org.elasticsearch.transport.netty.NettyTransport.
connectToChannels(**NettyTransport.java:673)
at org.elasticsearch.transport.netty.NettyTransport.
connectToNode(NettyTransport.**java:608)
at org.elasticsearch.transport.netty.NettyTransport.
connectToNode(NettyTransport.**java:578)
at org.elasticsearch.transport.TransportService.
connectToNode(**TransportService.java:128)
at org.elasticsearch.discovery.zen.ping.multicast.
MulticastZenPing$Receiver$2.**run(MulticastZenPing.java:539)
at java.util.concurrent.ThreadPoolExecutor.runWorker(
ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(
ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.**java:679)

The other nodes all have something similar to this:
[2013-09-04 00:04:08,164][INFO ][discovery.zen ] [
node12.example.com] master_left [[node20.example.com][**
hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]]], reason [failed
to ping, tried [3] times, each with maximum [30s] timeout]
[2013-09-04 00:04:08,208][INFO ][cluster.service ] [
node12.example.com] master {new [node17.example.com][6lA0f-

nmR8etT6GehitTmA][inet[/172.**31.12.17:9300]], previous [
node20.example.com][**hhmHf3hrSz6tjSktUN4Krg][inet[/**172.31.12.20:9300]]},
removed {[node20.example.com][**hhmHf3hrSz6tjSktUN4Krg][inet[/172.31.12.20:9300]],},
reason: zen-disco-receive(from master [[node17.example.com][6lA0f-

nmR8etT6GehitTmA][inet[/172.**31.12.17:9300]]])

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@**googlegroups.com.

For more options, visit https://groups.google.com/**groups/opt_outhttps://groups.google.com/groups/opt_out
.

--
mvh

Runar Myklebust

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com <javascript:>.
For more options, visit https://groups.google.com/groups/opt_out.

--
mvh

Runar Myklebust

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(system) #6