We've had a crash and had to restart our server.
Our cluster has 5 nodes and we use 1 replica.
Since then, every time we try to open an index, the replica for shard 3 (always 3!) gets stuck in the INITIALIZING state.
It even gets stuck when we create a new index (with zero documents).
After that, ES tries to move the replica from one node to another, but it never succeeds.
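
For anyone trying to reproduce the diagnosis, here is a minimal sketch of how the stuck replicas can be listed from the output of the `_cat/shards` API. The host URL and the helper names (`stuck_replicas`, `fetch_cat_shards`) are assumptions for illustration, not part of our setup.

```python
import urllib.request


def stuck_replicas(cat_shards_text):
    """Return (index, shard) pairs for replicas reported as INITIALIZING.

    Expects lines whose first four columns are: index shard prirep state.
    """
    stuck = []
    for line in cat_shards_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        index, shard, prirep, state = fields[:4]
        if prirep == "r" and state == "INITIALIZING":
            stuck.append((index, int(shard)))
    return stuck


def fetch_cat_shards(host="http://localhost:9200"):
    """Fetch the shard table from a node (the host is an assumed default)."""
    url = host + "/_cat/shards?h=index,shard,prirep,state,node"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")
```

Calling `stuck_replicas(fetch_cat_shards())` against one of the nodes should list only shard 3 entries if the cluster behaves the way ours does.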
We managed to fix it in a few cases by doing the following:
- Close the index
- SSH to the node holding the replica for shard 3 (the one that is stuck) and remove the shard directory from disk.
- Copy the primary for shard 3 from the node that holds it to the node currently holding the replica for shard 3.
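
The steps above can be sketched as a dry-run plan that only prints the commands instead of running them. Everything specific here is an assumption: the node names, the `data_root` path, the use of `rsync` over SSH between nodes, and the final `_open` call (reopening the index is implied by the steps but not listed). The `workaround_plan` helper is hypothetical.

```python
def workaround_plan(index, shard, primary_node, replica_node, data_root,
                    es_url="http://localhost:9200"):
    """Build the shell commands for the manual fix, without executing them.

    data_root is the node-level data directory that contains "indices/"
    (an assumed layout; check the actual path on your nodes).
    """
    shard_dir = "{}/indices/{}/{}".format(data_root, index, shard)
    return [
        # 1. Close the index so no node touches the shard files.
        "curl -XPOST '{}/{}/_close'".format(es_url, index),
        # 2. Remove the stuck replica copy from disk.
        "ssh {} rm -rf {}".format(replica_node, shard_dir),
        # 3. Copy the primary's shard directory onto the replica node.
        "ssh {} rsync -a {}/ {}:{}/".format(
            primary_node, shard_dir, replica_node, shard_dir),
        # 4. Reopen the index (assumed final step).
        "curl -XPOST '{}/{}/_open'".format(es_url, index),
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for cmd in workaround_plan("myindex", 3, "node-a", "node-b",
                               "/path/to/data/mycluster/nodes/0"):
        print(cmd)
```

Printing the plan first makes it easier to review the paths before deleting anything on a node.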
This got us back to a working index, but every time a shard gets relocated, we hit the same problem. We also cannot create any new index without going through all these steps.
We are using ES v1.4.2.
Is this a known problem?
Would upgrading to 1.7.2 be any help?