Refusal to recover after node rebuild

Hi,

I've got a proof of concept cluster with 5 nodes. Several months of rsyslog
data is in there, with 2 replicas per index.

I then decided to rebuild 2 nodes simultaneously. No problem. Cluster
reallocated as expected and each of the remaining 3 nodes stored all of the
indexes and replicas in full. Once the cluster had finished this
reallocation, I decided to rebuild another 2 nodes simultaneously (without
waiting for the first 2 to come back). This, after all, would leave 1 node
storing all of the data.
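
To spell out the arithmetic I was relying on, here is a rough sketch in
Python. The function name copies_assigned is just made up for illustration,
and it assumes only the basic rule that two copies of the same shard never
share a node; it ignores every other allocation setting.

```python
def copies_assigned(data_nodes: int, replicas: int) -> tuple[int, int]:
    """Return (assigned, unassigned) copies per shard.

    Assumption: two copies of the same shard are never placed on one
    node, and no other allocation rule gets in the way.
    """
    wanted = 1 + replicas               # one primary plus its replicas
    assigned = min(wanted, data_nodes)  # at most one copy per node
    return assigned, wanted - assigned


# The three stages described above: 5 nodes, then 3, then 1.
for nodes in (5, 3, 1):
    print(nodes, copies_assigned(nodes, 2))
```

By that reckoning, with 3 nodes every node holds one copy of every shard, and
with 1 node I would expect one copy of each shard to stay assigned and the 2
replicas of each to go unassigned, not every shard.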

Unfortunately that's where things started to unravel. The initial 2 nodes
have come back online and joined the cluster. But now my cluster reports
that every shard is unassigned and there doesn't seem to be any process
running to reallocate.

What I don't understand is that the cluster was fully balanced at 3 nodes
and 2 replicas per index. Does taking a node out in this instance cause a
problem? My data is still sitting on the node that hasn't been rebuilt,
but I can't get it to reallocate onto the other nodes.

It's only a proof of concept, so data loss isn't the issue here. It's
understanding why this happened and figuring out if I did anything
inherently wrong.

Cheers

Duncan
