Forcing sync of replicas


(Michael Salmon) #1

I had a problem with corrupted shards, so I restarted my cluster with
"index.shard.check_on_startup: fix" and the corrupted shards were fixed
(i.e. deleted). Unfortunately, the primaries and replicas then had differing
document counts even though the cluster was green. Fortunately the primaries
always had more documents than the replicas, so hopefully I haven't lost
anything.

To fix this I set the number of replicas to 0 and then back to 1 on all the
indices that had mismatches, forcing the replicas to be rebuilt from the
primaries. Is there a better technique? I really didn't like having just one
copy of my data, even if only for a short time.
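
For reference, the workaround above amounts to something like the following
(index name hypothetical; update-settings API as in 1.x):

```
# WARNING: this leaves only a single copy of the data while replicas are off
curl -XPUT 'localhost:9200/my-index/_settings' -d '
{ "index": { "number_of_replicas": 0 } }'

# Restore replicas; Elasticsearch rebuilds them from the primaries
curl -XPUT 'localhost:9200/my-index/_settings' -d '
{ "index": { "number_of_replicas": 1 } }'
```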

I am still running 1.1.1, is this addressed by a later release?

/Michael

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/92365c48-c08c-429f-97e2-5714e50052a3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Boaz Leskes) #2

Hi Michael,

The fix option of check_on_startup checks the index and removes any
segments that are corrupted. This is a Lucene-level operation and is
primarily meant for extreme cases where you had only one copy of a shard
and it got corrupted.

In your case, since the primaries are good, the easiest fix is to use the
reroute API to tell Elasticsearch to move the corrupted replicas to another
node. When moving a replica, ES actually makes a new copy from the primary,
which protects against exactly these kinds of situations:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-reroute.html#cluster-reroute
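
A move command for the reroute API looks roughly like this (index and node
names are hypothetical):

```
curl -XPOST 'localhost:9200/_cluster/reroute' -d '
{
  "commands": [
    {
      "move": {
        "index": "my-index",
        "shard": 0,
        "from_node": "node1",
        "to_node": "node2"
      }
    }
  ]
}'
```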

Cheers,
Boaz



(Michael Salmon) #3

Thanks, I didn't think of moving the shards. That should be faster as well.



(Michael Salmon) #4

Moving the shard was a good idea but unfortunately:

{
    "error": "ElasticsearchIllegalArgumentException[[move_allocation] can't move [ds_infrastructure-storage-na-qtree][0], shard is not started (state = INITIALIZING]]",
    "status": 400
}

Allocate didn't work either, since the allocate command only applies to
unassigned shards.
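
One way to see which shard copies are still initializing before issuing
reroute commands (index name hypothetical; the _cat API exists from 1.0):

```
curl 'localhost:9200/_cat/shards/my-index?v'
```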



(Boaz Leskes) #5

Hmm, yeah, I can now see that in the code. Another option is to use the
allocation filtering API to move the shard off its node, and then remove
the rule once relocation is done:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html#index-modules-allocation
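
Roughly like this (index and node names hypothetical; the exclude rule is
set and later cleared through the index settings API):

```
# Exclude the node holding the bad replica; ES relocates its shards elsewhere
curl -XPUT 'localhost:9200/my-index/_settings' -d '
{ "index.routing.allocation.exclude._name": "bad-node" }'

# Once relocation finishes, clear the rule
curl -XPUT 'localhost:9200/my-index/_settings' -d '
{ "index.routing.allocation.exclude._name": "" }'
```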


