We are manually reallocating shards to active nodes with the Cluster Reroute API after one or more nodes go down and indices turn red. We find the unassigned shards of the red indices and assign the primary shards to a currently active node; ES then creates the replica shards automatically. We want to make sure that no stale data is left lying around. The Elasticsearch documentation on the Cluster Reroute API states the following:
The allow_primary parameter will force a new empty primary shard to be allocated without any data. If a node which has a copy of the original primary shard (including data) rejoins the cluster later on, that data will be deleted: the old shard copy will be replaced by the new live shard copy.
Our question is:
If a node that has a copy of the original replica shard (including data) rejoins the cluster later on, does that data get deleted as well?
In other words,
Does ES clean up the data for both primary and replica shards if they have been reallocated and the original node containing those shards later comes back into the cluster?
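For context, here is a rough sketch of the manual procedure described above, in case it helps to see exactly what we send. It assumes the older `allocate` command with `allow_primary`, as in the documentation quoted above (newer Elasticsearch versions use `allocate_empty_primary` with `accept_data_loss` instead). The index name `logs`, the node name `node-1`, and both helper functions are made up for illustration; the input is assumed to be the text returned by `GET _cat/shards?h=index,shard,prirep,state`.

```python
import json


def unassigned_primaries(cat_shards_text):
    """Return (index, shard) pairs for unassigned primary shards.

    Assumes each line has the four columns: index shard prirep state.
    """
    result = []
    for line in cat_shards_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        index, shard, prirep, state = fields[:4]
        if prirep == "p" and state == "UNASSIGNED":
            result.append((index, int(shard)))
    return result


def build_reroute_body(shards, node):
    """Build a request body for POST _cluster/reroute.

    Uses the 'allocate' command with allow_primary, which forces a new,
    empty primary for each shard (any existing data for it is lost).
    """
    commands = [
        {
            "allocate": {
                "index": index,
                "shard": shard,
                "node": node,
                "allow_primary": True,
            }
        }
        for index, shard in shards
    ]
    return {"commands": commands}


if __name__ == "__main__":
    # Hypothetical _cat/shards output for a red index named "logs".
    sample = "logs 0 p UNASSIGNED\nlogs 0 r UNASSIGNED\nlogs 1 p STARTED\n"
    body = build_reroute_body(unassigned_primaries(sample), "node-1")
    print(json.dumps(body, indent=2))
```

Only the primaries are rerouted here; the replicas are left for ES to allocate on its own, which is why we are asking about the old replica copies specifically.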