Data corruption after add/remove node to unicast cluster

Steff · October 18, 2011, 1:11pm

Hi

We have made a simple test on rebalancing of shards.

Start-state:
One index with 3 shards (1 replica)
Two nodes running (having Node1, Node2 and Node3 in unicast list):
Node1 running primary of shard1, primary of shard2 and replica of
shard3
Node2 running primary of shard3, replica of shard1 and replica of
shard2

Action:
We start a new node (Node3 - also having Node1, Node2 and Node3 in its
unicast list) that joins the cluster

End-state (after rebalancing has finished)
Three nodes:
Node1 running primary of shard1 and primary of shard2
Node2 running primary of shard3
Node3 running replica of shard1, replica of shard2 and replica of
shard3
Basically ALL replicas have been moved to the new node.
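
To make the imbalance concrete, here is a minimal sketch that models the two layouts described above as plain Python dictionaries and counts shard copies per node. The helper names (`copies_per_node`, `is_valid`, `spread`) are made up for illustration and are not part of Elasticsearch; the layouts are transcribed from the post.

```python
from collections import Counter

# Shard layouts transcribed from the post: node -> list of (shard, role).
START_STATE = {
    "Node1": [("shard1", "primary"), ("shard2", "primary"), ("shard3", "replica")],
    "Node2": [("shard3", "primary"), ("shard1", "replica"), ("shard2", "replica")],
}
END_STATE = {
    "Node1": [("shard1", "primary"), ("shard2", "primary")],
    "Node2": [("shard3", "primary")],
    "Node3": [("shard1", "replica"), ("shard2", "replica"), ("shard3", "replica")],
}


def copies_per_node(layout):
    """Count how many shard copies (primary or replica) each node holds."""
    return {node: len(copies) for node, copies in layout.items()}


def is_valid(layout):
    """True if no node holds two copies of the same shard and
    every shard has exactly one primary."""
    primaries = Counter()
    all_shards = set()
    for copies in layout.values():
        shards = [shard for shard, _ in copies]
        if len(shards) != len(set(shards)):
            return False
        all_shards.update(shards)
        primaries.update(shard for shard, role in copies if role == "primary")
    return all(primaries[shard] == 1 for shard in all_shards)


def spread(layout):
    """Difference between the most and least loaded node, in shard copies."""
    counts = copies_per_node(layout).values()
    return max(counts) - min(counts)


print(copies_per_node(END_STATE))
print(is_valid(END_STATE), spread(START_STATE), spread(END_STATE))
```

Both layouts are valid in the sense that a primary and its replica never share a node, but the end-state spreads the six shard copies 2/1/3 across the nodes, where an even spread would be 2/2/2.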

Again (as in https://groups.google.com/group/elasticsearch/browse_thread/thread/232fdc4e560d41d)
we think that this is a very strange rebalancing of shards that ES
decided to do. But this time there were even bigger problems.

We did another action:
Stopped the new node (Node3) again.

Now rebalancing of the replicas back to the remaining nodes (Node1 and
Node2) starts. After a while the exception shown below occurs on one of
the remaining nodes, and afterwards the index is corrupted. Now,
no matter what we do (restart etc.), the cluster will not "accept" the
index again. We never regain contact with the index, and the data
can be considered lost - this would be very bad in production.

I notice the OutOfMemoryError, but really that shouldn't happen, and
if it does happen, it shouldn't corrupt the index/data for good.
Any ideas about what to do? Solutions? Comments?

Regards, Per Steffensen
------------- exception ----------------------------
[2011-10-14 09:43:52,113][WARN ][transport.netty ] [Sybil Dorn] Exception caught on netty layer [[id: 0x5dc433a2, /192.168.88.240:60385 => /192.168.88.241:9300]]
java.lang.OutOfMemoryError: Java heap space
[2011-10-14 09:43:52,114][WARN ][transport.netty ] [Sybil Dorn] Exception caught on netty layer [[id: 0x5dc433a2, /192.168.88.240:60385 => /192.168.88.241:9300]]
java.io.StreamCorruptedException: invalid data length: 0
    at org.elasticsearch.transport.netty.SizeHeaderFrameDecoder.decode(SizeHeaderFrameDecoder.java:42)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:282)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:783)
    at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:65)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:274)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:261)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

kimchy · October 18, 2011, 6:19pm

Which version are you using? Do you still have the logs around for the test
run, and if so, can you gist / attach them? OOM should not cause data loss,
and the failure you posted seems like a communication problem that might
happen because of the OOM.

Steff · October 24, 2011, 5:43pm

On Oct 18, 8:19 pm, Shay Banon kim...@gmail.com wrote:

> Which version are you using?

0.17.6

> Do you still have the logs around for the test
> run, and if so, can you gist / attach them?

Sorry, but they are lost.

> OOM should not cause data loss,
> and the failure you posted seems like a communication problem that might
> happen because of the OOM.

Hopefully we will get around to repeating the test at some point in the
future, and I will make sure to collect logs and attach them here.

Regards, Per Steffensen

kimchy · October 24, 2011, 10:32pm

0.17.6 might be the culprit, as an OOM-related fix went into 0.17.7.

Steff · October 26, 2011, 7:14am

On Oct 25, 00:32, Shay Banon kim...@gmail.com wrote:

> 0.17.6 might be the culprit, as an OOM-related fix went into 0.17.7.

Ok, we will consider upgrading when we are ready or when we rerun the
test and see the problem again.

Steff · November 1, 2011, 10:59am

On Oct 24, 23:32, Shay Banon kim...@gmail.com wrote:

> 0.17.6 might be the culprit, as an OOM-related fix went into 0.17.7.

We plan to upgrade from 0.17.6 to 0.18.2 now. Is there any information
available about whether you can simply do a software upgrade
between those versions without having to consider the configuration or
data already in 0.17.6? In other words, can I just stop all nodes
in my existing 0.17.6 cluster (already containing indices with data),
upgrade the version of ES installed on those nodes, copy the data folders
and elasticsearch.yml, and then start all nodes again running version
0.18.2? Or are there configuration entries that have been removed
(or whose semantics changed), or did the data format change, or
anything like that?

Regards, Per Steffensen

kimchy · November 1, 2011, 5:44pm

Yes, you can simply stop the nodes, use the new elasticsearch version, and
start them.
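
Around a stop-and-restart like this, one way to see whether the cluster has "accepted" its indices again is to look at the cluster health response. The sketch below only interprets an already-fetched response; it assumes the JSON fields `status`, `relocating_shards`, `initializing_shards` and `unassigned_shards` as returned by `GET /_cluster/health`, which should be checked against the version in use. The function name and both sample bodies are made up for illustration and are not taken from this thread.

```python
import json


def cluster_is_settled(health):
    """Return True when a cluster-health response reports green status
    and no shards are relocating, initializing, or unassigned."""
    return (
        health.get("status") == "green"
        and health.get("relocating_shards") == 0
        and health.get("initializing_shards") == 0
        and health.get("unassigned_shards") == 0
    )


# Made-up response bodies for illustration only.
BEFORE_STOP = json.loads(
    '{"status": "green", "number_of_nodes": 2, "relocating_shards": 0,'
    ' "initializing_shards": 0, "unassigned_shards": 0}'
)
AFTER_RESTART = json.loads(
    '{"status": "yellow", "number_of_nodes": 2, "relocating_shards": 0,'
    ' "initializing_shards": 1, "unassigned_shards": 2}'
)

print(cluster_is_settled(BEFORE_STOP))
print(cluster_is_settled(AFTER_RESTART))
```

A check like this would typically be run before stopping the nodes and polled again after the restart until it reports a settled cluster.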
