What happens to replicas after a rolling restart?

Hi,

We follow these steps to restart our cluster: https://www.elastic.co/guide/en/elasticsearch/guide/current/_rolling_restarts.html
What confuses us is how the replicas recover. Are the replicas lost during the restart, so that they need to be allocated anew? Or is only a consistency check run on them?

Thanks & regards.
Shiny

Which version are you asking about? The behaviour changed between 1.5 and 1.6.

Now using 1.7.1 @warkolm

Then they don't get removed. Recovery uses synced flush, which is kind of like consistency checking.
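
For reference, the restart sequence on 1.6+ can be sketched as a few shell helpers. This is only a minimal sketch: `ES_URL` and the function names are made up for illustration, while `cluster.routing.allocation.enable` and `_flush/synced` are the setting and endpoint available in 1.6/1.7.

```shell
#!/bin/sh
# Assumed cluster address; override ES_URL for your own environment.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Build the transient-settings body that switches shard allocation.
# $1 is "none" (before stopping a node) or "all" (after it has rejoined).
allocation_body() {
  printf '{"transient":{"cluster.routing.allocation.enable":"%s"}}' "$1"
}

# Send the allocation setting to the cluster.
set_allocation() {
  curl -s -XPUT "$ES_URL/_cluster/settings" -d "$(allocation_body "$1")"
}

# Request a synced flush on all indices so that idle shard copies
# receive a sync marker that recovery can compare later.
synced_flush() {
  curl -s -XPOST "$ES_URL/_flush/synced"
}
```

A typical run would be `set_allocation none`, then `synced_flush`, then restarting the node and waiting for it to rejoin, and finally `set_allocation all`.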

If we keep “cluster.routing.allocation.disable_allocation: true” and allocate the replicas with a command like this:

 curl -XPOST 127.0.0.1:9200/_cluster/reroute -d '{
   "commands" : [ {
     "allocate" : {
       "index" : "myIndex", "shard" : 1, "node" : "node_where_replica_in_before_restart"
     }
   } ]
 }'

Will the replica be kept, and will recovery still use synced flush?

Only if a copy of that replica already exists on that node. If it doesn't, the shard needs to be copied over completely.
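
To know which node held each replica, one option is to record the shard layout before the restart and compare it afterwards. The sketch below uses the `_cat/shards` API; `ES_URL`, the function names and the index name `myIndex` are placeholders chosen for illustration.

```shell
#!/bin/sh
# Assumed cluster address; override ES_URL for your own environment.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Build the _cat/shards URL for one index, asking only for the columns
# needed to see where each primary (p) and replica (r) lives.
shards_url() {
  printf '%s/_cat/shards/%s?v&h=index,shard,prirep,state,node' "$ES_URL" "$1"
}

# Print the current shard layout of the given index.
list_shards() {
  curl -s "$(shards_url "$1")"
}
```

Saving the output of `list_shards myIndex` to a file before the restart gives a reference for the `node` value in a manual `allocate` command.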

Why would you do that though?

Thanks for your answer, that's very helpful~

We are considering a solution that keeps allocation disabled and allocates all the shards manually. That's how we ran into this question.

I'm also curious about replica recovery in 1.5. Can you tell me more about how it worked in older versions?

Pre 1.6 we essentially copied the entire shard over from scratch, ignoring anything the node may already have locally.

But again, why? ES handles this all for you.

Well... Our project manager does not trust any balancing mechanism (MongoDB broke his heart), so we have to control the shards ourselves. Showing him how well it works did not help. QAQ

All in all, thanks very much~

That's going to lead to larger problems.....

It's usually best to leverage the features and abilities of your tools to your advantage rather than to manage things of this nature manually. Elasticsearch handles allocation quite well automatically if used correctly. The more recent the version, usually the better things will run due to bug fixes and enhancements.

But I understand that you're under specific instructions to do it this way. All you can do is communicate that this isn't the recommended way and that it will lead to bigger issues the longer you keep doing it and as your data/cluster grows.