I had a problem with corrupted shards, so I restarted my cluster with
"index.shard.check_on_startup: fix" and the corrupted shards were fixed
(i.e. deleted). Unfortunately, the replicas and primaries then had differing
numbers of documents despite the indices all being green. Fortunately, the
primaries always had more documents than the replicas, so hopefully I haven't
lost anything.
To fix this I set the number of replicas to 0 and then back to 1 on all the
indices that had mismatches. Is there a better technique? I really didn't like
having just one copy of my data, even if it was only for a short time.
I am still running 1.1.1; is this addressed by a later release?
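For reference, this is roughly what I ran to rebuild the replicas (the index name below is just a placeholder):

curl -XPUT 'localhost:9200/my-index/_settings' -d '{
  "index": { "number_of_replicas": 0 }
}'
curl -XPUT 'localhost:9200/my-index/_settings' -d '{
  "index": { "number_of_replicas": 1 }
}'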
Hi Michael,
The fix option of check_on_startup checks the indices and removes the segments
that are corrupted. This is a Lucene-level operation and is primarily meant to
be used in extreme cases where you only had one copy of the shards and it got
corrupted.
In your case, since the primaries are good, the easiest would be to use the
reroute API to tell Elasticsearch to move the replicas that have been
corrupted to another node. When moving a replica, ES actually makes a new copy
from the primary, which protects against exactly these kinds of situations:
see the cluster reroute documentation (reference/current/cluster-reroute.html#cluster-reroute).
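A move command looks roughly like this (index, shard number and node names are placeholders):

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
  "commands": [
    { "move": {
        "index": "my-index",
        "shard": 0,
        "from_node": "node_with_bad_replica",
        "to_node": "some_other_node"
    } }
  ]
}'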
Cheers,
Boaz
Thanks, I didn't think of moving the shards. That should have been faster as
well.
Moving the shard was a good idea but unfortunately:
{
  "error": "ElasticsearchIllegalArgumentException[[move_allocation] can't move [ds_infrastructure-storage-na-qtree][0], shard is not started (state = INITIALIZING]]",
  "status": 400
}
Allocate didn't work either, as the shard was not unallocated.
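For what it's worth, I was checking the shard states with something like this (index name is a placeholder):

curl 'localhost:9200/_cat/shards/my-index?v'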
Hmm, yeah, I can now see that in the code. Another option is to use the
allocation filtering API to move the shard off the node, and then remove the
rule once it is done.
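Roughly something like this (the node name is a placeholder); clear the setting again once the shard has relocated:

curl -XPUT 'localhost:9200/my-index/_settings' -d '{
  "index.routing.allocation.exclude._name": "node_with_bad_replica"
}'

curl -XPUT 'localhost:9200/my-index/_settings' -d '{
  "index.routing.allocation.exclude._name": ""
}'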