Split brain on 0.17.8 -> 0.18.7 upgrade with unicast

We upgraded two clusters from 0.17.8 to 0.18.7 that were using unicast
discovery (due to network restrictions) and ended up in a split-brain
situation: nodes would only discover other nodes running the same
version as themselves.
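
For reference, "using unicast" here means a zen discovery configuration
along these lines in elasticsearch.yml on every node (the hosts and
port shown are illustrative):

# elasticsearch.yml: disable multicast and list the nodes to ping explicitly
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts:
  - search-01.example.com:9300
  - search-02.example.com:9300
  - search-03.example.com:9300
  - search-04.example.com:9300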

The first node to be upgraded to 0.18.7 started spewing these errors
in its log when it came back up:

[2012-01-11 10:03:53,674][INFO ][node                     ]
[search-01] {elasticsearch/0.17.8}[19888]: stopping ...
[2012-01-11 10:03:53,929][INFO ][node ]
[search-01] {elasticsearch/0.17.8}[19888]: stopped
[2012-01-11 10:03:53,929][INFO ][node ]
[search-01] {elasticsearch/0.17.8}[19888]: closing ...
[2012-01-11 10:03:54,228][INFO ][node ]
[search-01] {elasticsearch/0.17.8}[19888]: closed
[2012-01-11 10:05:49,512][INFO ][node ]
[search-01] {0.18.7}[7135]: initializing ...
[2012-01-11 10:05:49,519][INFO ][plugins ]
[search-01] loaded [], sites [bigdesk, head]
[2012-01-11 10:05:50,776][INFO ][node ]
[search-01] {0.18.7}[7135]: initialized
[2012-01-11 10:05:50,776][INFO ][node ]
[search-01] {0.18.7}[7135]: starting ...
[2012-01-11 10:05:50,836][INFO ][transport ]
[search-01] bound_address {inet[/0:0:0:0:0:0:0:0:9300]},
publish_address {inet[/10.147.174.140:9300]}
[2012-01-11 10:05:50,970][WARN ][discovery.zen.ping.unicast]
[search-01] failed to send ping to
[[#zen_unicast_4#][inet[search-04.example.com/10.147.176.146:9300]]]
org.elasticsearch.transport.RemoteTransportException:
[search-04][inet[/10.147.176.146:9300]][discovery/zen/unicast]
Caused by: java.io.EOFException
at org.elasticsearch.common.io.stream.StreamInput.readBoolean(StreamInput.java:229)
at org.elasticsearch.discovery.zen.ping.ZenPing$PingResponse.readFrom(ZenPing.java:89)
at org.elasticsearch.discovery.zen.ping.ZenPing$PingResponse.readPingResponse(ZenPing.java:82)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$UnicastPingRequest.readFrom(UnicastZenPing.java:400)
at org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:182)
at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:87)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:783)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:302)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:317)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:299)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:783)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:65)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:274)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:261)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

Then, as other upgraded nodes came back up on 0.18.7, it started
discovering them:

[2012-01-11 10:05:53,898][INFO ][cluster.service          ]
[search-01] new_master
[search-01][M3fXnCk3TxS6T-GPjfgYQg][inet[/10.147.174.140:9300]],
reason: zen-disco-join (elected_as_master)
[2012-01-11 10:05:53,918][INFO ][discovery ]
[search-01] example/M3fXnCk3TxS6T-GPjfgYQg
[2012-01-11 10:05:54,055][INFO ][http ]
[search-01] bound_address {inet[/0:0:0:0:0:0:0:0:9200]},
publish_address {inet[/10.147.174.140:9200]}
[2012-01-11 10:05:54,055][INFO ][node ]
[search-01] {0.18.7}[7135]: started
[2012-01-11 10:05:54,078][INFO ][gateway ]
[search-01] recovered [2] indices into cluster_state
[2012-01-11 10:05:54,080][INFO ][cluster.metadata ]
[search-01] updating number_of_replicas to [0] for indices
[search_engine_20120110_104503]
[2012-01-11 10:05:54,955][INFO ][cluster.metadata ]
[search-01] [search_engine_20120110_104503] auto expanded replicas to
[0]
[2012-01-11 10:05:54,957][INFO ][cluster.metadata ]
[search-01] updating number_of_replicas to [0] for indices
[search_engine_20120109_104503]
[2012-01-11 10:05:54,961][INFO ][cluster.metadata ]
[search-01] [search_engine_20120109_104503] auto expanded replicas to
[0]
[2012-01-11 10:05:54,962][INFO ][cluster.metadata ]
[search-01] updating number_of_replicas to [0] for indices
[search_engine_20120109_104503]
[2012-01-11 10:05:54,966][INFO ][cluster.metadata ]
[search-01] [search_engine_20120109_104503] auto expanded replicas to
[0]
[2012-01-11 10:06:08,316][INFO ][cluster.service ]
[search-01] added
{[search-02][exui0NkQQNyjngxQlyX4PQ][inet[/10.147.176.140:9300]],},
reason: zen-disco-receive(join from
node[[search-02][exui0NkQQNyjngxQlyX4PQ][inet[/10.147.176.140:9300]]])
[2012-01-11 10:06:08,320][INFO ][cluster.metadata ]
[search-01] updating number_of_replicas to [1] for indices
[search_engine_20120110_104503]
[2012-01-11 10:06:08,382][INFO ][cluster.metadata ]
[search-01] [search_engine_20120110_104503] auto expanded replicas to
[1]
[2012-01-11 10:06:08,382][INFO ][cluster.metadata ]
[search-01] updating number_of_replicas to [1] for indices
[search_engine_20120109_104503]
[2012-01-11 10:06:08,387][INFO ][cluster.metadata ]
[search-01] [search_engine_20120109_104503] auto expanded replicas to
[1]
[2012-01-11 10:06:08,388][INFO ][cluster.metadata ]
[search-01] updating number_of_replicas to [1] for indices
[search_engine_20120109_104503]
[2012-01-11 10:06:08,392][INFO ][cluster.metadata ]
[search-01] [search_engine_20120109_104503] auto expanded replicas to
[1]
[2012-01-11 10:07:44,837][INFO ][cluster.service ]
[search-01] added
{[search-03][xG_BCl-uTz69cwBslygwsQ][inet[/10.147.174.142:9300]],},
reason: zen-disco-receive(join from
node[[search-03][xG_BCl-uTz69cwBslygwsQ][inet[/10.147.174.142:9300]]])
[2012-01-11 10:07:44,839][INFO ][cluster.metadata ]
[search-01] updating number_of_replicas to [2] for indices
[search_engine_20120110_104503]
[2012-01-11 10:07:44,896][INFO ][cluster.metadata ]
[search-01] [search_engine_20120110_104503] auto expanded replicas to
[2]
[2012-01-11 10:07:44,897][INFO ][cluster.metadata ]
[search-01] updating number_of_replicas to [2] for indices
[search_engine_20120109_104503]
[2012-01-11 10:07:44,902][INFO ][cluster.metadata ]
[search-01] [search_engine_20120109_104503] auto expanded replicas to
[2]
[2012-01-11 10:07:44,902][INFO ][cluster.metadata ]
[search-01] updating number_of_replicas to [2] for indices
[search_engine_20120109_104503]
[2012-01-11 10:07:44,907][INFO ][cluster.metadata ]
[search-01] [search_engine_20120109_104503] auto expanded replicas to
[2]
[2012-01-11 10:07:45,651][DEBUG][action.admin.cluster.node.info]
[search-01] failed to execute on node [xG_BCl-uTz69cwBslygwsQ]

Meanwhile, the nodes still on 0.17.8 logged "Message not fully read"
warnings before they too went down for their 0.18.7 upgrade:

[2012-01-11 10:03:53,928][INFO ][cluster.service          ]
[search-03] removed
{[search-01][VGeQKtjoT4axRLKI5K_EWQ][inet[/10.147.174.140:9300]],},
reason: zen-disco-receive(from master
[[search-02][LWPgo4rHRASq1nrCI2ZRAA][inet[/10.147.176.140:9300]]])
[2012-01-11 10:04:13,164][INFO ][discovery.zen ]
[search-03] master_left
[[search-02][LWPgo4rHRASq1nrCI2ZRAA][inet[/10.147.176.140:9300]]],
reason [shut_down]
[2012-01-11 10:04:13,180][INFO ][cluster.service ]
[search-03] master {new
[search-03][5wOHm0i8SmeVqL5JkRSaFQ][inet[/10.147.174.142:9300]],
previous [search-02][LWPgo4rHRASq1nrCI2ZRAA][inet[/10.147.176.140:9300]]},
removed {[search-02][LWPgo4rHRASq1nrCI2ZRAA][inet[/10.147.176.140:9300]],},
reason: zen-disco-master_failed
([search-02][LWPgo4rHRASq1nrCI2ZRAA][inet[/10.147.176.140:9300]])
[2012-01-11 10:04:13,197][INFO ][cluster.metadata ]
[search-03] updating number_of_replicas to [1] for indices
[search_engine_20120110_104503]
[2012-01-11 10:04:13,200][INFO ][cluster.metadata ]
[search-03] [search_engine_20120110_104503] auto expanded replicas to
[1]
[2012-01-11 10:04:13,200][INFO ][cluster.metadata ]
[search-03] updating number_of_replicas to [1] for indices
[search_engine_20120109_104503]
[2012-01-11 10:04:13,202][INFO ][cluster.metadata ]
[search-03] [search_engine_20120109_104503] auto expanded replicas to
[1]
[2012-01-11 10:04:13,202][INFO ][cluster.metadata ]
[search-03] updating number_of_replicas to [1] for indices
[search_engine_20120110_104503]
[2012-01-11 10:04:13,204][INFO ][cluster.metadata ]
[search-03] [search_engine_20120110_104503] auto expanded replicas to
[1]
[2012-01-11 10:04:13,204][INFO ][cluster.metadata ]
[search-03] updating number_of_replicas to [1] for indices
[search_engine_20120109_104503]
[2012-01-11 10:04:13,205][INFO ][cluster.metadata ]
[search-03] [search_engine_20120109_104503] auto expanded replicas to
[1]
[2012-01-11 10:04:13,205][INFO ][cluster.metadata ]
[search-03] updating number_of_replicas to [1] for indices
[search_engine_20120109_104503]
[2012-01-11 10:04:13,207][INFO ][cluster.metadata ]
[search-03] [search_engine_20120109_104503] auto expanded replicas to
[1]
[2012-01-11 10:05:50,946][WARN ][transport.netty ]
[search-03] Message not fully read (request) for [0] and action
[discovery/zen/unicast], resetting
[2012-01-11 10:05:52,380][WARN ][transport.netty ]
[search-03] Message not fully read (request) for [4] and action
[discovery/zen/unicast], resetting
[2012-01-11 10:05:53,884][WARN ][transport.netty ]
[search-03] Message not fully read (request) for [7] and action
[discovery/zen/unicast], resetting
[2012-01-11 10:06:05,333][WARN ][transport.netty ]
[search-03] Message not fully read (request) for [1] and action
[discovery/zen/unicast], resetting
[2012-01-11 10:06:06,783][WARN ][transport.netty ]
[search-03] Message not fully read (request) for [6] and action
[discovery/zen/unicast], resetting
[2012-01-11 10:06:08,288][WARN ][transport.netty ]
[search-03] Message not fully read (request) for [10] and action
[discovery/zen/unicast], resetting
[2012-01-11 10:06:48,924][INFO ][node ]
[search-03] {elasticsearch/0.17.8}[31667]: stopping ...
[2012-01-11 10:06:49,174][INFO ][node ]
[search-03] {elasticsearch/0.17.8}[31667]: stopped
[2012-01-11 10:06:49,175][INFO ][node ]
[search-03] {elasticsearch/0.17.8}[31667]: closing ...
[2012-01-11 10:06:49,587][INFO ][node ]
[search-03] {elasticsearch/0.17.8}[31667]: closed
[2012-01-11 10:07:40,405][INFO ][node ]
[search-03] {0.18.7}[31045]: initializing ...
[2012-01-11 10:07:40,413][INFO ][plugins ]
[search-03] loaded [], sites [bigdesk, head]
[2012-01-11 10:07:41,712][INFO ][node ]
[search-03] {0.18.7}[31045]: initialized
[2012-01-11 10:07:41,713][INFO ][node ]
[search-03] {0.18.7}[31045]: starting ...
[2012-01-11 10:07:41,769][INFO ][transport ]
[search-03] bound_address {inet[/0:0:0:0:0:0:0:0:9300]},
publish_address {inet[/10.147.174.142:9300]}

Anyway, I'll see if it happens again the next time we upgrade. It
didn't happen with upgrades within 0.17.*, and given the way our
cluster is set up (we aren't populating data all the time), having a
split brain for a bit is not that big of a deal.

Hi Ævar

On Wed, 2012-01-11 at 12:53 +0100, Ævar Arnfjörð Bjarmason wrote:

> We upgraded two clusters from 0.17.8 to 0.18.7 that were using unicast
> discovery (due to network restrictions) and ended up in a split-brain
> situation: nodes would only discover other nodes running the same
> version as themselves.

Yes - while you can sometimes do a rolling upgrade on "major" version
numbers, the general advice is not to do so as they are not guaranteed
to be able to talk to each other.

clint

On Wed, Jan 11, 2012 at 13:00, Clinton Gormley clint@traveljury.com wrote:

> Yes - while you can sometimes do a rolling upgrade on "major" version
> numbers, the general advice is not to do so as they are not guaranteed
> to be able to talk to each other.

So what is the general advice for doing upgrades across major
revisions?

On Wed, Jan 11, 2012 at 13:31, Shay Banon kimchy@gmail.com wrote:

Full cluster restart. Shut down the cluster, upgrade, and then start it
with the new version.
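
A minimal sketch of that shutdown step, assuming the nodes-shutdown API
that the 0.18-era versions exposed (the helper name and host/port are
placeholders, and the API no longer exists in current releases):

import json
import urllib.request

def shutdown_all_nodes(host="localhost", port=9200):
    # Ask every node in the cluster to exit via the old nodes-shutdown API,
    # then upgrade the packages and start the nodes again on the new version.
    req = urllib.request.Request(
        url="http://%s:%d/_cluster/nodes/_shutdown" % (host, port),
        data=b"",  # empty body; makes this an explicit POST
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

Once the binaries are upgraded everywhere, the nodes can be started
again to form a single cluster on the new version.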

On Wed, Jan 11, 2012 at 2:28 PM, Ævar Arnfjörð Bjarmason avarab@gmail.com wrote:

> And if you want to service search requests without that downtime?
>
> I guess the only viable way to do that is to:
>
>   1. Stop anything that may be adding new data to the cluster
>   2. Start a rolling restart, suffer the split brain
>   3. Once everything's restarted we'll have a unified brain with no conflicts
>   4. Restart whatever's adding data to the cluster

Yea, that would work, though I suggest stopping half the cluster (and
disabling reallocation, since you are going to start it back up) using
the cluster update settings API with the disable_allocation setting. On
the half of the cluster that is down, start the new version, and then
restart the other nodes that are still on the old version.

p.s. Your use of "split brain" is not really appropriate in this case,
since you are causing it yourself by running two different major
versions of elasticsearch in the same cluster.
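
A minimal sketch of the allocation-disabling step described above,
using the cluster update settings API (the helper name, host/port, and
the exact 0.18-era setting key are assumptions):

import json
import urllib.request

def set_allocation_disabled(disabled, host="localhost", port=9200):
    # Toggle cluster-wide shard allocation. The "transient" wrapper and the
    # cluster.routing.allocation.disable_allocation key follow the old
    # cluster update settings API; check the docs for your exact version.
    body = json.dumps(
        {"transient": {"cluster.routing.allocation.disable_allocation": disabled}}
    ).encode("utf-8")
    req = urllib.request.Request(
        url="http://%s:%d/_cluster/settings" % (host, port),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

Call set_allocation_disabled(True) before stopping the first half of the
nodes, and set_allocation_disabled(False) once every node is back up on
the new version.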
