The following was the sequence of events:
- Single-node cluster (node N-1, which was also the master) with 5 indexes of 12 shards each (60 primary shards total).
- Scaled up the cluster by adding 2 nodes (N-2 and N-3) and 2 replicas per shard. The replicas initialized, but ActiveShards was still 60 (why not 180?).
- The master node (N-1) left the cluster (reason: shut_down). ActiveShards was still 60 and the cluster stayed green.
- A new master (N-3) was elected and the cluster was still green.
- The original node (N-1) rejoined the cluster (after ~50 minutes) and accepted the new master (N-3).
- Soon after, the cluster turned red and multiple primary shards and their replicas became unassigned.
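For reference, the shard arithmetic behind the 60-vs-180 question can be sketched as plain arithmetic (not an Elasticsearch API call):

```python
# Expected shard counts for the cluster described above.
indexes = 5
primaries_per_index = 12
replicas = 2  # replica copies added per primary during the scale-up

primary_shards = indexes * primaries_per_index  # 60 primaries
total_shards = primary_shards * (1 + replicas)  # 180 primaries + replicas

print(primary_shards, total_shards)  # → 60 180
```

So once both replicas were initialized, I expected the active shard count to read 180, not 60.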
I have the following questions:
1. When the 2 nodes (N-2 and N-3) joined the cluster (see step 2 above), the cluster was green, but the number of ActiveShards was still 60. Why not 180?
2. Why did the cluster turn red soon after the node (N-1) rejoined it?
3. Why did restarting the node not help or trigger re-allocation?
4. Why did a cluster reroute not do anything?
I could see that the node (N-1) still had the shards on its disk, but Elasticsearch was not recognizing them as valid shards.
Eventually, I had to set 'index.recovery.initial_shards' to 1, and soon after, all shards reassigned themselves. How and why?
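For reference, the workaround amounted to lowering the recovery threshold. A sketch of how the setting could be placed in elasticsearch.yml (assuming an Elasticsearch version that still supports this setting; it also accepts values like quorum or full):

```yaml
# elasticsearch.yml (sketch): recover a primary as soon as one shard
# copy is found on disk, instead of waiting for a quorum of copies.
index.recovery.initial_shards: 1
```

My working guess is that with only one copy required, the shard data N-1 still held on disk became eligible for primary recovery, which would match the reassignment I observed, but I would like to understand the mechanism.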