Stop and start an Elasticsearch node holding all the primary shards

I have an Elasticsearch (v5.6.10) cluster with 3 nodes.

  • Node A : Master
  • Node B : Master + Data
  • Node C : Master + Data

There are 6 shards per data node, with the number of replicas set to 1. All 6 primary shards are on Node B and all 6 replicas are on Node C.
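For reference, this layout can be verified with the cat shards API (just a sketch; it lists whatever indices exist in the cluster):

curl 'localhost:9200/_cat/shards?v&h=index,shard,prirep,state,node'

The prirep column shows p for a primary and r for a replica, and the node column shows which node holds each copy.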

My requirement is to take Node B out, do some maintenance work, and put it back into the cluster without any downtime.

I checked the Elastic documentation, this discussion forum, and Stack Overflow questions. I found that I should first execute the request below in order to move the shards on that node to the remaining nodes.

curl -XPUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
  "transient" : {
      "cluster.routing.allocation.exclude._ip" : "<Node B IP>"
   }
}';echo

Once all the shards have been reallocated, I can shut down the node and do my maintenance work. Once I am done, I have to include the node again for allocation, and Elasticsearch will rebalance the shards again.
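To include the node again, my understanding is that the exclusion can be cleared by setting it back to null (a sketch mirroring the request above):

curl -XPUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
  "transient" : {
      "cluster.routing.allocation.exclude._ip" : null
   }
}';echo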

Now I also found another discussion where a user ran into yellow/red cluster health because they had only one data node but had the number of replicas set to 1, which left the replica shards unassigned. It seems to me that while doing this exercise I am taking my cluster towards that state.

So my concern is whether I am following the correct approach, keeping in mind that all my primary shards are on the node (Node B) that I am taking out of a cluster with the number of replicas set to 1.

You will get stuck here. You only have two data nodes, and each shard has two copies (one primary and one replica), so there is nowhere else to move them and the reallocation won't do anything.

Temporarily unassigned shards aren't normally a big deal. Just shut the node down, do your maintenance, then bring it back up again. No need to reallocate anything. The cluster health will be yellow for a while, but that's OK; you can still search and index as normal.
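If you want some reassurance while the node is down, the cluster health API will show the yellow status, and indexing a throwaway document shows that writes still work (the index and type names below are made up for illustration):

curl 'localhost:9200/_cluster/health?pretty'

curl -XPOST 'localhost:9200/my-test-index/my-test-type?pretty' -H 'Content-Type: application/json' -d '{
  "message" : "indexing still works with one data node"
}'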


Thanks for the guidance. I will proceed as per your suggestion. Just one question, though: the node (Node B) I am planning to shut down contains all the primary shards, so once I take it out of the cluster I will only have one data node (Node C) with all the replicas. Won't that cause any issues while indexing?

The replicas on node C will be promoted to primaries immediately on node B's departure, so indexing should be unaffected.
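If you want to see this happen, the same cat shards call as above should show the copies on Node C flip from r to p shortly after Node B leaves:

curl 'localhost:9200/_cat/shards?v&h=index,shard,prirep,state,node'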

Thanks a lot. I believe I can follow the same strategy if I need to take out Node C, and even Node A (master only), for maintenance in the future.

I believe I can follow the same strategy for Node C, as it is almost the same as Node B, only holding the replica shards instead of the primary ones. Please confirm.

But for Node A (master only), I believe that if I take it out of the cluster, one of Node B or Node C will become master. When I join Node A back, will it become master again? If not, what purpose will it serve?

Correct.

No, probably not.

Redundancy. It's there in case something goes wrong with one of the other nodes.
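If you ever want to check which node is currently the elected master after any of these restarts, the cat APIs show it (illustrative):

curl 'localhost:9200/_cat/master?v'
curl 'localhost:9200/_cat/nodes?v&h=name,node.role,master'

The elected master is marked with * in the master column of the nodes output.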

Thanks.

I also came to know that there is a Kibana instance running on Node B, pointing to the Elasticsearch node on the same machine. I hope stopping Kibana and starting it again after Elasticsearch is back up will suffice.
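For context, in Kibana 5.x that connection is the elasticsearch.url setting in kibana.yml, which presumably looks something like this on Node B (the URL here is a placeholder):

elasticsearch.url: "http://localhost:9200"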

The primary shard of Kibana's .kibana index is on Node B along with all the other primary shards, and Node C has the .kibana replica shard. Once I take Node B out, all the replica shards on Node C will become primaries, including the .kibana one.

Now once I add Node B back, will it hold only replica shards? If so, Kibana will then read from the .kibana replica shard. Will that make any difference?

No, that's fine. You really don't need to worry about the allocation of primaries vs replicas as you are doing. Elasticsearch handles all this for you.
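If you want to see it concretely, the .kibana shard copies can be listed like any other index (purely illustrative):

curl 'localhost:9200/_cat/shards/.kibana?v&h=index,shard,prirep,state,node'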

Thanks @DavidTurner

@DavidTurner Sorry to bother you again regarding the same issue. I understand that Elasticsearch takes care of the allocation of primaries and replicas for me and that I don't need to worry about it.

But once I take Node B out, all the replica shards on Node C will be promoted to primary, including the .kibana one. When I join Node B back in, it will hold the replica shards, so Kibana will then point to the replica copy of the .kibana index instead of the primary one as before. Will that cause any issue?

No, as I said above, that's fine. It would be pretty hard to use otherwise. As I said, Elasticsearch handles this all for you.

@DavidTurner Thanks a lot for answering all my queries.
