Non-recovering index shard

Pat_Christopher · August 10, 2011, 5:36pm

Hey guys,
After Monday's AWS-EC2 failbasket I ended up with split-brain and a number of
rather strange configurations. I've cleaned those out with judicious use
of kill and by restarting the ES nodes.

My index has 5 shards and one replica on six data nodes. After my kill
fiesta all five shards were yellow or red. Four of them have come back to
green and are accepting writes again. The fifth shard has stubbornly
remained at yellow even after closing and opening the index. It claims to
have one active shard and one initializing shard. It's been initializing for
about 20 hours now and I don't think it's going to finish. When the other
shards were initializing there was a tremendous amount of disk activity; now
there is nothing spectacular going on.
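
For anyone following along, the per-shard picture described above (one active copy, one initializing) is what the cluster health API reports when queried with `level=shards`. Below is a minimal sketch that picks out the non-green shards from such a response. The index name `myindex` and the sample payload are hypothetical, and the field names are assumed to match what `GET /_cluster/health?level=shards` returns; check them against your own cluster's output.

```python
from typing import Any


def non_green_shards(health: dict[str, Any]) -> list[tuple[str, int, str, int]]:
    """Return (index, shard, status, initializing_copies) for each non-green shard."""
    problems = []
    for index_name, index in health.get("indices", {}).items():
        for shard_id, shard in index.get("shards", {}).items():
            if shard["status"] != "green":
                problems.append(
                    (index_name, int(shard_id), shard["status"], shard["initializing_shards"])
                )
    return sorted(problems)


# Hypothetical, trimmed response body; not taken from the cluster in this thread.
sample_health = {
    "status": "yellow",
    "indices": {
        "myindex": {
            "status": "yellow",
            "shards": {
                "0": {"status": "green", "primary_active": True,
                      "active_shards": 2, "initializing_shards": 0, "unassigned_shards": 0},
                "4": {"status": "yellow", "primary_active": True,
                      "active_shards": 1, "initializing_shards": 1, "unassigned_shards": 0},
            },
        }
    },
}

print(non_green_shards(sample_health))
```

In practice the dictionary would come from parsing the JSON body of the health request rather than from a literal.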

How can I kick the last initializing shard into working? Will an increase in
shards cause it to rebalance and fix itself, or will that only cause more
problems?
Two of the six data nodes have no data on them. I'm not entirely sure what
they're doing but I'd like to get them involved. Any suggestions on how to
push indices onto them? Possibly at the same time as fixing the busted
shard?

Thanks,
Pat

Pat_Christopher · August 10, 2011, 10:33pm

I've increased the number of replicas from 1 to 2. This has caused the data
to be spread out over all nodes. I did some more research and found that
no, I can't change the number of shards for an index after it was created.

However, one shard still has a replica which is stuck in initializing. Any
idea how I can get ES to abandon that replica and try again someplace else?
Or if I turn the number of replicas down from 2 to 1, will ES kill the replica
that is still initializing and purge it?
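
The replica count change mentioned above goes through the index update settings API. Here is a minimal sketch that only builds the request, assuming a node reachable at `http://localhost:9200` and an index called `myindex` (both hypothetical). The `Content-Type` header is what recent Elasticsearch versions expect and may not be needed on older ones.

```python
import json
import urllib.request


def replica_update_request(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build (but do not send) a PUT to /{index}/_settings changing number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and index name.
req = replica_update_request("http://localhost:9200", "myindex", 2)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))

# To actually apply the change against a live cluster:
# urllib.request.urlopen(req)
```

The number of shards has no equivalent call, since it is fixed when the index is created.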

Pat

kimchy · August 11, 2011, 9:25am

> Two of the six data nodes have no data on them. I'm not entirely sure what
> they're doing but I'd like to get them involved. Any suggestions on how to
> push indices onto them? Possibly at the same time as fixing the busted
> shard?

The reason is that rebalancing will not start until the cluster is green in
order to reduce the number of relocations.

Regarding the shard that is stuck initializing, that's strange. You
have several options here. The simplest would be to bring down the node the
replica shard is initializing on, and then start it back up. That should kick
off the recovery again. Another option is to reduce the replicas to 1, but it
won't necessarily remove that initializing shard.
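
To find which node to bounce, the routing table in the cluster state lists every shard copy with its state and the node it sits on. Below is a rough sketch that scans a parsed `GET /_cluster/state` response for copies in the `INITIALIZING` state. The sample payload, node ids, node names, and index name are all hypothetical, and the field layout is an assumption that may differ between versions.

```python
from typing import Any


def initializing_copies(state: dict[str, Any]) -> list[tuple[str, int, bool, str]]:
    """Return (index, shard, is_primary, node_name) for each INITIALIZING shard copy."""
    nodes = state.get("nodes", {})
    found = []
    for index_name, index in state["routing_table"]["indices"].items():
        for shard_id, copies in index["shards"].items():
            for copy in copies:
                if copy["state"] == "INITIALIZING":
                    node_id = copy["node"]
                    # Fall back to the raw node id if the name is not listed.
                    node_name = nodes.get(node_id, {}).get("name", node_id)
                    found.append((index_name, int(shard_id), copy["primary"], node_name))
    return sorted(found)


# Hypothetical, trimmed cluster state; ids and names are made up for illustration.
sample_state = {
    "nodes": {
        "id-a": {"name": "node-a"},
        "id-b": {"name": "node-b"},
    },
    "routing_table": {
        "indices": {
            "myindex": {
                "shards": {
                    "4": [
                        {"state": "STARTED", "primary": True, "node": "id-a",
                         "shard": 4, "index": "myindex"},
                        {"state": "INITIALIZING", "primary": False, "node": "id-b",
                         "shard": 4, "index": "myindex"},
                    ]
                }
            }
        }
    },
}

print(initializing_copies(sample_state))
```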

Btw, which version are you using?

Pat_Christopher · August 11, 2011, 5:56pm

Rebalancing only when green: makes sense. Thanks.

There was something wrong with that node as a whole: it had two permanently
initializing shards after the replica increase. I've shut it down and the
cluster has gone green. It had this message over and over in the log file:

[Beetle] master should not receive new cluster state from [[O'Meggan, Alfie]

The bad node is Beetle and the master for the cluster is O'Meggan, Alfie.
Any idea what would cause this?

I'm using 0.17.2

Pat
