Best effort inconsistent recovery


(Sebastian Gavarini) #1

I am evaluating a couple of bad shutdown and recovery scenarios. I
indexed some documents in two indices in two ES instances, with no
replicas, so each instance holds a unique copy/shard/index.
ES1 -> Index1 -> {doc1,doc2,...,doc10}
ES2 -> Index2 -> {doc20,doc21,...,doc30}

If I kill both instances without a graceful shutdown and then bring
back only one, any query through the API throws a cluster state
exception that prevents every operation. So far so good.

I would like to know if there are configuration options that let me
bring the cluster up again on a best-effort basis, possibly losing
some indices in exchange for less downtime. Maybe a parameter like
"bring back whatever you can find after 5 min", or a cluster unblock
command. Then I would manually schedule a full index rebuild from DB
storage, with almost no downtime on the live site.

To elaborate further: I know the idea is to have everything
replicated, so this case shouldn't happen. But if it does, through
some unplanned event or error, I'd rather accept some lost data than
have the whole cluster down and blocked, and then have to manually
delete all the indices and do a full index rebuild. Can that be done?

Thanks,
Sebastian.


(Shay Banon) #2

What's your configuration for each node?



(Sebastian Gavarini) #3

I did a couple of tests and found two different cases.

The common parameter used:
recover_after_nodes: 1

  1. Replicas set to 1. I added the documents to two different
    indices on two nodes (ES1->index1-2, ES2->index1-2), killed both,
    picked ES1, deleted its index1, and brought back only ES1. It
    automatically recreated an empty index1, and index2 was fine with
    its data. When I then brought back ES2, it copied the empty index1
    from the master ES1, overwriting its own index1, which is expected.

  2. Replicas set to 0. I added the documents to two different
    indices on two nodes (ES1->index1, ES2->index2), so each index went
    to a different node, killed both, and brought back one, for example
    ES1, which contains only index1. I get:
    "error" : "ClusterBlockException[blocked by: [3/index not recovered
    (not enough nodes with shards allocated found)];]"
    So it doesn't try to recreate index2 as in the previous case. Is this
    the default behaviour? If replicas are 0, is a missing index never
    recreated?

Thanks,
Sebastian.



(Shay Banon) #4

To be honest, I don't really understand your tests. Why do you delete the
actual index from the relevant machine? It does not make sense. What are you
trying to simulate?

An index is not allocated to a specific node. Indices are created on all
nodes, and their shards are allocated among the different nodes. So something
like ES1-index1 is not really meaningful.

With no replicas, a specific index stays blocked until at least one copy of
each of its shards can be found across all nodes. Once that copy is found,
the index is recovered.
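
The rule above can be illustrated with a small toy model. This is only a sketch of the stated rule, not Elasticsearch code: the `ShardCopies` mapping, the `blocked_indices` function, and the node names are all hypothetical and exist only for illustration. It assumes an index is blocked whenever at least one of its shards has no copy on any node that is currently up.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical model: (index name, shard id) -> nodes holding a copy of that shard.
ShardCopies = Dict[Tuple[str, int], Set[str]]


def blocked_indices(copies: ShardCopies, live_nodes: Set[str]) -> List[str]:
    """Return indices that have at least one shard with no copy on a live node."""
    blocked = set()
    for (index, _shard), holders in copies.items():
        # No overlap between the holders and the live nodes: the shard is missing.
        if not (holders & live_nodes):
            blocked.add(index)
    return sorted(blocked)


# 0 replicas, one shard per index, each shard on a different node.
no_replicas: ShardCopies = {("index1", 0): {"ES1"}, ("index2", 0): {"ES2"}}
# 1 replica, so both nodes hold a copy of every shard.
one_replica: ShardCopies = {
    ("index1", 0): {"ES1", "ES2"},
    ("index2", 0): {"ES1", "ES2"},
}

print(blocked_indices(no_replicas, {"ES1"}))  # ['index2']
print(blocked_indices(one_replica, {"ES1"}))  # []
```

With only ES1 up, the zero-replica layout leaves index2 without any shard copy, which matches the block reported in the second test, while the one-replica layout has nothing blocked.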

-shay.banon



(Sebastian Gavarini) #5

My explanation probably wasn't detailed enough, sorry. I'll try to
clarify.

What I am trying to simulate is an unrecoverable missing index (in my
case each index uses only one shard).

The test consists of adding, for example, 8 documents, 4 to index1
and 4 to index2, with two instances, ES1 and ES2, up and running.
I don't know the internals, but index1 ends up on one ES node, let's
say ES1, and index2 on ES2. If replicas are 0, then you have one index
(of one shard) on each ES node. That's case 2) in my other post.

In case 1), as replicas were set to 1, each of the two ES nodes had a
copy of index1 and index2. So when I killed both nodes and then
brought back, say, ES1, it already contained index1 and index2. That's
why I had to manually delete index1 on ES1, after which it recreated
index1 empty and brought back index2 intact, and the whole
(single-node) cluster was up and running.

Maybe my test wasn't the best, but I hope you get the picture of what
I am trying to do. I would like to know if there is a command,
parameter, etc. to call as a last resort, like I said in my first
post, "bring back whatever you can find after 5 min", so the cluster
won't stay blocked. Sometimes it's more affordable to lose some data
than to lose the whole cluster in order to prevent missing data.

I was thinking something like: "recover_unlock_cluster: 5m"
gateway:
  type: local
  recover_after_time: 2m
  recover_after_nodes: 2
  recover_unlock_cluster: 5m

Do you think it is possible?
Thanks,
Sebastian.



(Shay Banon) #6

If a specific index can't recover all of its shards, then it's blocked.
Recovering just some of the shards does not make sense, since you won't be
able to tell what was recovered and what was not. If an index is blocked
(this is exposed in the cluster state API), you can decide to delete it
and reindex the relevant data.
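
A rough sketch of that check, working on an already-fetched cluster state, might look like the following. The layout of the state is an assumption, not something confirmed in this thread: the code assumes per-index blocks appear under `blocks` -> `indices`, keyed by index name. The `find_blocked_indices` helper and the sample data are hypothetical, and the delete-and-reindex step is left as a placeholder that only prints.

```python
from typing import Any, Dict, List


def find_blocked_indices(cluster_state: Dict[str, Any]) -> List[str]:
    """List index names that carry at least one block in the given state.

    Assumes (not confirmed here) that per-index blocks are found under
    cluster_state["blocks"]["indices"], keyed by index name.
    """
    index_blocks = cluster_state.get("blocks", {}).get("indices", {})
    return sorted(name for name, blocks in index_blocks.items() if blocks)


# Hand-written sample shaped like the assumed layout; not real server output.
sample_state = {
    "blocks": {
        "indices": {
            "index2": {"3": {"description": "index not recovered"}},
        }
    }
}

for name in find_blocked_indices(sample_state):
    # Placeholder for the delete-and-reindex step; nothing is sent to a server.
    print(f"would delete and reindex: {name}")
```

Keeping the check as a pure function over a dictionary makes it easy to adapt once the real shape of the cluster state response has been verified against a running node.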

-shay.banon



(Sebastian Gavarini) #7

Ok, I'll try that then, thanks!


