Data loss in Elasticsearch 1.4.2

Aadithya_C · March 1, 2015, 7:58am

Hi,
We can consistently see data loss in Elasticsearch in the following
scenario:

1. Create a 2 node cluster with a primary and a replica.
2. Add 2 more nodes and keep uploading documents while the 2 new nodes are
   coming up.
3. Set cluster.routing.allocation.exclude._ip for the first 2 nodes so
   that all shards relocate.
4. As soon as the shard status on the new nodes is STARTED, terminate the
   first 2 nodes (both primary and replica).
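
For anyone trying to follow along, here is a minimal sketch of the exclusion step. It only builds the request for the cluster update settings API and does not send it. The IP addresses and the helper name build_exclude_request are made up for illustration, and using a transient (rather than persistent) setting is an assumption, not something stated in this thread.

```python
import json


def build_exclude_request(old_node_ips):
    """Return (method, path, body) for a cluster settings update that
    excludes the given node IPs from shard allocation.

    Hypothetical helper: the caller still has to send the request.
    """
    body = {
        "transient": {
            # Comma-separated list of IPs that shards should move away from.
            "cluster.routing.allocation.exclude._ip": ",".join(old_node_ips)
        }
    }
    return "PUT", "/_cluster/settings", json.dumps(body, sort_keys=True)


# Example with placeholder addresses for the two original nodes.
method, path, body = build_exclude_request(["10.0.0.1", "10.0.0.2"])
print(method, path)
print(body)
```

The setting is cluster-wide, so it is sent once to any node rather than configured separately on each of the old nodes.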

Intermittently, documents that were uploaded while shard relocation was in
progress are missing afterwards. My question: is the transaction log copied
in addition to the index files during shard relocation? My current guess is
that the transaction log is not copied, so documents that have not yet been
flushed are lost. Can someone confirm whether this hypothesis is correct?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/dac11750-1f4b-404d-8098-bbb649f62dc3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Aadithya_C · March 1, 2015, 2:24pm

Looks like shard recovery has 5 stages, one of which is translog. So now the
question becomes: what is the correct moment to shut down the old nodes? Is
terminating the old nodes once the shard status on the new ones is STARTED
incorrect? Also, cluster health is green; only the documents go missing.
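
A conservative check before terminating the old nodes could look like the sketch below. It assumes the health dict is the parsed JSON from GET /_cluster/health and that the shard listing is the text returned by GET /_cat/shards?h=index,shard,prirep,state,ip. The function names and the sample data are made up, and this does not settle whether STARTED alone should already be sufficient.

```python
def parse_cat_shards(text):
    """Parse `_cat/shards?h=index,shard,prirep,state,ip` output into dicts.

    Lines with fewer than five columns (for example unassigned shards,
    which have no IP) are skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 5:
            index, shard, prirep, state, ip = parts[:5]
            rows.append({"index": index, "shard": shard, "prirep": prirep,
                         "state": state, "ip": ip})
    return rows


def safe_to_terminate(health, shard_rows, old_node_ips):
    """Return True only if the cluster is green, nothing is relocating or
    initializing, and no shard copy is still reported on an old node."""
    if health.get("status") != "green":
        return False
    if health.get("relocating_shards", 0) or health.get("initializing_shards", 0):
        return False
    old = set(old_node_ips)
    return not any(row["ip"] in old for row in shard_rows)


# Made-up sample: one shard copy is still relocating away from an old node.
sample_health = {"status": "green", "relocating_shards": 1, "initializing_shards": 0}
sample_rows = parse_cat_shards("docs 0 p RELOCATING 10.0.0.1\ndocs 0 r STARTED 10.0.0.4\n")
print(safe_to_terminate(sample_health, sample_rows, ["10.0.0.1", "10.0.0.2"]))
```

This is stricter than watching only the shard state on the new nodes, since it also requires that nothing is relocating and that the old nodes hold no shard copies.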

Dennis · March 1, 2015, 9:14pm

I thought the whole idea of the cluster is that ANYTHING could be shut down
at ANY TIME and with enough replica shards and a quorum maintained at all
times, everything would be OK. So, does this not work with a cluster
smaller than 5 machines/nodes?

Boaz_Leskes · March 2, 2015, 7:23am

Aadithya, Dennis,

This should work, and what you see may indicate a bug. To answer your
question about the use of the translog: recovery has 3 main phases. The first
is copying over files (if needed; we try to reuse local files). This is the
longest phase. The second phase is to replay all operations done on the
source shard since the beginning of phase 1. This is done from the translog.
The third and last phase is to make sure that all operations done during the
second phase are also replayed (this is a safety measure, as we start
replicating these operations as soon as phase 1 is over).
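
To make the three phases concrete, here is a toy model of the description above. It is not Elasticsearch code: the class, the function, and the replay_translog switch are all made up for illustration. Writes during phase 2 are applied after the replay to keep the ordering simple, and each document is just a key in a dict so that replaying an operation twice is harmless.

```python
class SourceShard:
    """Toy source shard: flushed documents plus an in-order translog."""

    def __init__(self, flushed_docs):
        self.files = dict(flushed_docs)  # documents already in index files
        self.translog = []               # (doc_id, body) operations

    def index(self, doc_id, body):
        self.translog.append((doc_id, body))


def recover(source, writes_during_phase1, writes_during_phase2,
            replay_translog=True):
    """Return the target shard's documents after a simulated recovery."""
    target = {}

    # Phase 1: copy the index files. Writes arriving now reach only the source.
    phase1_start = len(source.translog)
    target.update(source.files)
    for doc_id, body in writes_during_phase1:
        source.index(doc_id, body)
    phase1_end = len(source.translog)

    # Phase 2: replay operations logged since the beginning of phase 1.
    if replay_translog:
        for doc_id, body in source.translog[phase1_start:phase1_end]:
            target[doc_id] = body
    # From the end of phase 1 on, new writes are also replicated to the target.
    for doc_id, body in writes_during_phase2:
        source.index(doc_id, body)
        target[doc_id] = body

    # Phase 3: replay what was logged during phase 2 as a safety measure.
    if replay_translog:
        for doc_id, body in source.translog[phase1_end:]:
            target[doc_id] = body
    return target


with_replay = recover(SourceShard({"a": 1}), [("b", 2)], [("c", 3)])
without_replay = recover(SourceShard({"a": 1}), [("b", 2)], [("c", 3)],
                         replay_translog=False)
print(sorted(with_replay), sorted(without_replay))
```

In this model, turning the replay off loses exactly the documents written during phase 1, which is the symptom the "translog is not copied" hypothesis would predict. With the replay on, nothing is lost.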

Aadithya, can you open a GitHub issue with an exact reproduction, preferably
as code? These things can be very tricky and the devil is in the details.
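
As a starting point for such a reproduction, a sketch of the verification step might look like this. It assumes the uploader records every document ID it sent, and that each ID is then checked with a HEAD request on the document URL. The helper names, base URL, index name, and type name are placeholders, not anything taken from this thread.

```python
import urllib.error
import urllib.request


def doc_exists(base_url, index, doc_type, doc_id):
    """Return True if a HEAD on /{index}/{type}/{id} succeeds, False on 404.

    Requires a reachable cluster; other HTTP errors are re-raised.
    """
    url = "{}/{}/{}/{}".format(base_url.rstrip("/"), index, doc_type, doc_id)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def missing_ids(uploaded_ids, found_ids):
    """Return the sorted IDs that were uploaded but are no longer found."""
    return sorted(set(uploaded_ids) - set(found_ids))


# Offline example with made-up IDs; a real run would build found_ids with
# doc_exists() once the old nodes have been terminated.
print(missing_ids(["doc-1", "doc-2", "doc-3"], ["doc-1", "doc-3"]))
```

Recording the upload time of each ID alongside it would also show whether the missing documents cluster around the relocation window.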

Cheers,
Boaz
