Shard failure after restart of node - ES 1.7.5

Hi. We have a three node cluster, running ES 1.7.5, with the following config.
index.number_of_shards: 5
index.number_of_replicas: 2
node.master: true true

After writing our data we see each node has five shards [0..4] as expected.
We perform a query and obtain the correct number of documents, etc...
We then stop nodes 1 and 2 and successfully re-query the cluster.
All well and good.

We then stop node 3 and start it again. Queries now fail with the error:
"SearchPhaseExecutionException[Failed to execute phase [query], all shards failed]"
We receive the same error no matter which single node we start.

We interpret our configuration to mean we should be able to successfully query our cluster with only one node running.
Any ideas?

Thanks in advance.

Which version of Elasticsearch are you running?


I see the same behavior using 2.3.3

This is an issue where ES wants to have a "quorum" of shard copies available before it recovers on a cluster restart. You can set index.recovery.initial_shards to 1 so that it only waits for one shard copy to be available before the primary recovers.
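
For anyone wanting to apply this on an existing index, here is a minimal sketch of the settings update as an HTTP request, built with the Python standard library. The host `http://localhost:9200`, the index name `my_index`, and the helper name are placeholders of mine, not from the original post, and the request is only constructed here, not sent. It assumes the setting can be changed through the index `_settings` endpoint on your version.

```python
import json
from urllib.request import Request


def build_initial_shards_request(base_url, index, initial_shards):
    """Build (but do not send) a PUT <index>/_settings request.

    base_url and index are hypothetical placeholders; initial_shards is the
    value for index.recovery.initial_shards (e.g. 1 or "quorum").
    """
    body = json.dumps({"index.recovery.initial_shards": initial_shards})
    return Request(
        url=f"{base_url}/{index}/_settings",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_initial_shards_request("http://localhost:9200", "my_index", 1)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To actually send it, pass `req` to `urllib.request.urlopen` against a running node. The same setting can also go in elasticsearch.yml next to the other index.* defaults shown in the question.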

As a side note, you could also have lowered the number of replicas (to 0 or 1) and the primary would have recovered as well. ES waits until enough nodes are up to satisfy index.recovery.initial_shards. The default is quorum: with 3 shard copies (1 primary plus 2 replicas), a quorum is 2 copies, so ES won't recover the shard until 2 nodes holding those copies are up. If you set the number of replicas to 1 and keep initial_shards at quorum, then starting just one node would meet the quorum.
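
To make the quorum arithmetic concrete, here is a small sketch using the plain majority formula (floor(copies / 2) + 1). This is the textbook definition of a quorum and is an assumption on my part; ES's internal calculation may treat small replica counts differently, so take the numbers as illustrative only.

```python
def copies_per_shard(number_of_replicas):
    """Total copies of one shard: the primary plus its replicas."""
    return number_of_replicas + 1


def majority_quorum(copies):
    """Plain majority of the copies (assumed formula, not taken from ES source)."""
    return copies // 2 + 1


for replicas in (2, 1, 0):
    copies = copies_per_shard(replicas)
    print(f"replicas={replicas} copies={copies} quorum={majority_quorum(copies)}")
```

With 2 replicas this gives 3 copies and a quorum of 2, matching the case in the question: a single running node holds only one copy of each shard, so the quorum is not met.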

@abeyad - Thanks for your insight; it solved our issue. Setting index.number_of_replicas: 1 did not work; it needed to be set to 0. Setting index.recovery.initial_shards to 1 worked.

Thanks again.