I am confused: in your first post you said that you had one unassigned
primary, but in your second post you say it's one unassigned replica. Which
one is correct? I will assume it's an unassigned replica, since that makes
more sense.
The "failed to merge" means that your lucene index got corrupted, but I
don't see how this would cause problems with shards allocation.
Did you set any "cluster.routing.allocation.*" or
"index.routing.allocation.*" settings? If you did, we might need to check
these settings and clean them up appropriately first.
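If you aren't sure, you can check both levels with something like the
following (a sketch assuming the default HTTP port 9200 and an index named
IndexName, as in your logs; note that _cluster/settings only shows settings
applied through the API, so anything set in elasticsearch.yml has to be
checked in the config files on each node):

  # Cluster-level allocation settings (persistent and transient)
  curl -XGET 'http://localhost:9200/_cluster/settings?pretty'

  # Index-level settings, including any index.routing.allocation.* values
  curl -XGET 'http://localhost:9200/IndexName/_settings?pretty'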
If you didn't, the safest solution here is to just restart the cluster. If a
full cluster restart is not an option and the index with the unallocated
shard has a small number of shards (ideally 1), you can try setting the
number of replicas on this index to 0 and then back to 1. That will trigger
reallocation and might get this shard unstuck.
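Something along these lines (again assuming port 9200 and the index name
IndexName; wait for the first change to be applied before issuing the
second):

  # Drop replicas to 0 -- this removes the stuck unassigned replica
  curl -XPUT 'http://localhost:9200/IndexName/_settings' -d '{
    "index": {"number_of_replicas": 0}
  }'

  # Bring replicas back to 1 -- a fresh copy is allocated from the primary
  curl -XPUT 'http://localhost:9200/IndexName/_settings' -d '{
    "index": {"number_of_replicas": 1}
  }'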
In any case, I would recommend upgrading the cluster to a more modern
version some time soon. There have been significant improvements in cluster
resiliency to these types of issues, and you get much more control over
allocation.
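For example, newer releases (0.90 and later, if I remember correctly) add a
cluster reroute API that lets you explicitly allocate an unassigned shard to
a node, which also answers your question about forcing assignment manually.
A rough sketch; the index name, shard number, and node name below are
placeholders you would replace with your own:

  # Ask the master to allocate shard 2 of IndexName on node NodeName.
  # allow_primary stays false so this can only place a replica, never
  # promote an empty shard to primary (which could lose data).
  curl -XPOST 'http://localhost:9200/_cluster/reroute' -d '{
    "commands": [
      {"allocate": {"index": "IndexName", "shard": 2,
                    "node": "NodeName", "allow_primary": false}}
    ]
  }'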
On Tuesday, October 30, 2012 5:01:50 AM UTC-4, Stefan Pi wrote:
_cluster/health is exactly the same on all nodes (1 unassigned non-primary
shard, all other shards are assigned). I didn't see any "failed to execute
cluster state update". What we have are some of these:
[NodeName] [IndexName][2] failed to merge
java.lang.IndexOutOfBoundsException: Index: 116, Size: 23
and some of these:
[NodeName] [IndexName][0] failed to merge
org.apache.lucene.index.CorruptIndexException: docs out of order (187 <=
187 ) (out: org.elasticsearch.index.store.Store$StoreIndexOutput@38f21bf)
Do you think these are connected to the unassigned shard? I also wonder
where these are coming from.
Anyway, is there a way to manually force the assignment of a shard?
Thank you very much for your help,
Stefan