Hello,
We are manually reallocating shards to active nodes using the Cluster Reroute API after one or more nodes go down and indices turn red. We find the unassigned shards of the red indices and allocate their primary shards to a currently active node; ES then creates the replica shards automatically. We want to make sure that no stale data is left lying around. The Elasticsearch documentation on the Cluster Reroute API states the following:
The allow_primary parameter will force a new empty primary shard to be allocated without any data. If a node which has a copy of the original primary shard (including data) rejoins the cluster later on, that data will be deleted: the old shard copy will be replaced by the new live shard copy.
Our question is:
If a node which has a copy of the original replica shard (including data) rejoins the cluster later on, does that data get deleted as well?
In other words,
Does ES clean up the data for both primary and replica shards if they have been reallocated and the original node containing those shards later comes back into the cluster?
ES seems to recover on its own as long as at least one copy of a shard is still available. For instance, suppose an index is configured with 2 copies of each shard (one primary and one replica). If both the primary and the replica are lost because two different nodes go down before ES has a chance to rebalance, the index goes red and never recovers. We found this during our testing. In that case we manually allocate the primary shard onto an active node, and ES then automatically creates a replica shard. These new shards have no data.
The scenario is:
1. Both the primary and the replica shard are lost because two nodes (1 and 2) go down.
2. Both are re-assigned to active nodes using the Cluster Reroute API.
3. The nodes (1 and 2) that were down rejoin the cluster.
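To make the reallocation step concrete, here is a minimal sketch of how such a reroute request could be built. It assumes the older `allocate` command with `allow_primary` quoted from the documentation above, and it assumes the unassigned shards were listed with `_cat/shards`, whose first four columns are index, shard, prirep and state. The helper name `build_reroute_body`, the index name `logs-1` and the node name `node-3` are made up for illustration and are not from our actual setup.

```python
import json


def build_reroute_body(cat_shards_text, target_node):
    """Build a reroute body that allocates every unassigned primary to target_node.

    cat_shards_text is assumed to be plain-text _cat/shards output, where the
    leading columns are: index, shard, prirep (p or r), state.
    """
    commands = []
    for line in cat_shards_text.splitlines():
        cols = line.split()
        if len(cols) < 4:
            continue  # skip blank or malformed lines
        index, shard, prirep, state = cols[:4]
        # Only primaries are allocated by hand; replicas are left to ES.
        if prirep == "p" and state == "UNASSIGNED":
            commands.append({
                "allocate": {
                    "index": index,
                    "shard": int(shard),
                    "node": target_node,
                    # Forces a new EMPTY primary, as described in the docs.
                    "allow_primary": True,
                }
            })
    return {"commands": commands}


# Hypothetical sample output: one lost primary, one lost replica, one healthy shard.
sample = """\
logs-1 0 p UNASSIGNED
logs-1 0 r UNASSIGNED
logs-1 1 p STARTED 120 1.2mb 10.0.0.3 node-3
"""

body = build_reroute_body(sample, "node-3")
print(json.dumps(body, indent=2))
```

The resulting JSON would then be sent as the body of a POST to `_cluster/reroute`. Only the unassigned primary ends up in the command list; the replica is skipped because ES creates it once the primary is active.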
The ES documentation says that the PRIMARY shard copy on the node rejoining the cluster will be deleted, but it does not mention anything about the replica shard.
Does the replica shard copy get deleted as well?