Cluster reroute and potential data loss


(Mark Conlin) #1

I was reading some ES docs (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-reroute.html) and stumbled upon this part of the Cluster Reroute API:

allocate:
"Allocate an unassigned shard to a node. .... It also accepts the
"allow primary" flag to explicitly specify that it is allowed to explicitly
allocate a primary shard (might result in data loss)."

Why might this result in data loss?

If I use:

POST /_cluster/reroute
{
  "commands" : [ {
    "allocate" : {
      "index" : "myindex", "shard" : 4, "node" : "somenode",
      "allow_primary" : "true"
    }
  } ]
}

To get a node that has unallocated shards back to green, how will I know if
data loss has occurred?
How/why is the data being lost?

Thanks,
Mark

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0829b11c-d18a-4f6e-9cf4-67a94fd55daa%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Nik Everett) #2

If every copy of a particular shard (primary and replicas) is unallocated and
you allocate one with allow_primary, it will be allocated empty. If a node
that had some data for that shard then comes back, it won't be able to use
that data, because the shard has already been allocated empty.
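
To make the flag concrete, here is a minimal Python sketch of the request body under discussion. It assumes the 1.x-era `allocate` command quoted from the docs earlier in this thread (later major versions replaced it with separate commands such as `allocate_replica` and `allocate_empty_primary`). The helper name `build_allocate_body` is made up for illustration, and the index and node values are the placeholders from the original post. It only builds the JSON and sends nothing.

```python
import json


def build_allocate_body(index, shard, node, allow_primary=False):
    """Build a body for POST /_cluster/reroute using the `allocate` command.

    allow_primary defaults to False on purpose: per the explanation above,
    setting it to True when no copy of the shard is allocated yields a new,
    empty primary, and the old on-disk data can no longer be used.
    """
    return {
        "commands": [
            {
                "allocate": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    "allow_primary": allow_primary,
                }
            }
        ]
    }


# Placeholder values taken from the original post.
body = build_allocate_body("myindex", 4, "somenode")
print(json.dumps(body, indent=2))
```

Keeping the risky flag as an explicit opt-in means a caller has to write `allow_primary=True` deliberately before an empty primary can be created.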

On Mon, Feb 3, 2014 at 4:42 PM, Mark Conlin mark.conlin@gmail.com wrote:

Why might this result in data loss?


(Mark Conlin) #3

So during a cluster restart we sometimes get nodes with unallocated
shards, where both the primary and the replica are unallocated.
They stay stuck in this state, leaving the cluster red.

If I force allocation with allow_primary="true", I get a new blank shard
and all docs are lost.
If I force allocation with allow_primary="false", I get an error:

{
  "error": "RemoteTransportException[[yournodename][inet[/10.1.1.1:9300]][cluster/reroute]]; nested: ElasticSearchIllegalArgumentException[[allocate] trying to allocate a primary shard [yourindexname][4]], which is disabled]; ",
  "status": 400
}

Once the cluster gets to this state, am I just out of luck on recovering
the data in these shards?

Mark
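
One way to spot the shards in this state before forcing anything is to look for shard numbers where every copy is unassigned. The sketch below parses text in the shape of `GET /_cat/shards` output, assuming the first four columns are index, shard, prirep and state (an assumption about the version in use). The function name and the sample rows are invented for illustration and do not come from a real cluster.

```python
from collections import defaultdict


def fully_unassigned(cat_shards_text):
    """Return sorted (index, shard) pairs for which no copy is assigned."""
    states = defaultdict(list)
    for line in cat_shards_text.splitlines():
        cols = line.split()
        # Skip blank lines, malformed lines, and a header row if present.
        if len(cols) < 4 or not cols[1].isdigit():
            continue
        index, shard, _prirep, state = cols[:4]
        states[(index, int(shard))].append(state)
    return sorted(
        key for key, seen in states.items()
        if all(s == "UNASSIGNED" for s in seen)
    )


# Invented sample rows: index, shard, prirep, state, then optional columns.
sample = """\
myindex 3 p STARTED 1200 1.1mb 10.1.1.1 node-a
myindex 3 r STARTED 1200 1.1mb 10.1.1.2 node-b
myindex 4 p UNASSIGNED
myindex 4 r UNASSIGNED
myindex 5 p STARTED 900 800kb 10.1.1.2 node-b
myindex 5 r UNASSIGNED
"""

print(fully_unassigned(sample))  # [('myindex', 4)]
```

The shards this returns are the ones where allow_primary="true" would produce a blank shard, so they are the ones to investigate before forcing allocation.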

On Mon, Feb 3, 2014 at 4:57 PM, Nikolas Everett nik9000@gmail.com wrote:

If all replicas of a particular shard are unallocated and you
allow_primary allocate one then it'll allocate empty. If a node that had
some data for that shard comes back it won't be able to use that data
because the shard has been allocated empty.



(Ankur Goel) #4

Hi,

I am also facing the same issue. Did you get it resolved?
I am facing it while doing an EBS snapshot-based recovery.

On Tuesday, 4 February 2014 03:46:04 UTC+5:30, Mark Conlin wrote:

Once the cluster gets to this state, am I just out of luck on recovering
the data in these shards?

