Could someone clarify how replicas of closed index shards are handled prior to 7.2?
I work with a cluster that's currently running Elasticsearch 6.8.6. I have always thought that closed index shards are replicated, i.e. if the index has one replica while open, then it still has one replica after it's closed.
When open, all the indices on our cluster have one replica. If I look on the disks of our Elasticsearch nodes for directories containing shards of a given closed index, I find two copies of each shard. If I open a closed index, its shards are (usually) all allocated without any copying of shards happening, which could only happen if Elasticsearch found a primary and a replica of each shard on disk. (Sometimes some replicas are recreated by copying primaries; I think this only happens when opening lots of indices at once, requiring allocation of, say, 200-300 shards.)
Is the change that Elasticsearch will now recreate shards of closed indices if the node they are on is removed, whereas before 7.2 that wouldn't happen? Or is it something else?
Yes, that's absolutely right. Prior to 7.2.0, closed indices were not actively replicated. If you closed an index then all its copies remained on disk, so you were protected against some failures, but there was no process to replace a copy that was lost due to a failure. For instance, if you added some new data nodes and then shut down all the old data nodes, all closed indices would be lost and you wouldn't know about it until you tried to open them again.
Sounds good. Does it mean that with 7.2 onwards it's possible to use cluster.routing.allocation.exclude to tell the cluster to move data off a node and it will move the shards of closed indices as well as those of open indices?
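For reference, a minimal sketch of the request being asked about. It only builds the method, path, and JSON body for the cluster settings API and does not contact a cluster. The node names and the helper name `build_exclude_request` are made up for illustration; `cluster.routing.allocation.exclude._name` is the by-name form of the exclude setting.

```python
import json


def build_exclude_request(node_names, transient=True):
    """Build (method, path, body) for excluding nodes by name from shard allocation.

    node_names are hypothetical; substitute the names of the nodes being drained.
    """
    scope = "transient" if transient else "persistent"
    body = {
        scope: {
            # Comma-separated list of node names to move shards away from.
            "cluster.routing.allocation.exclude._name": ",".join(node_names)
        }
    }
    return "PUT", "/_cluster/settings", json.dumps(body, sort_keys=True)


if __name__ == "__main__":
    method, path, body = build_exclude_request(["node-old-1", "node-old-2"])
    print(method, path)
    print(body)
```

The resulting body can be sent with any HTTP client. Setting the value to `null` in a later request removes the exclusion.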
I had to decommission some nodes a while back which contained lots of shards of closed indices. It was quite tedious because getting the shards of closed indices off the nodes meant opening an index, waiting for the shards to move, then closing it again. I ended up writing a script to handle it automatically, but it still took ages. (Just opening all the affected indices at once would have almost certainly killed the cluster!)
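A rough sketch of that open, wait, close loop, handling one index at a time so the cluster is never asked to allocate everything at once. This is not the actual script mentioned above: `drain_steps`, `drain`, and the `send` callback are hypothetical names, and `send` is assumed to perform the HTTP request and block until it returns. The endpoints used are the open/close index APIs and the cluster health API with its `wait_for_status` and `wait_for_no_relocating_shards` parameters.

```python
def drain_steps(index, timeout="10m"):
    """Return the ordered (method, path) requests for moving one closed index."""
    health = (
        f"/_cluster/health/{index}"
        f"?wait_for_status=green&wait_for_no_relocating_shards=true&timeout={timeout}"
    )
    return [
        ("POST", f"/{index}/_open"),   # open so the shards become movable
        ("GET", health),               # block until allocation and relocation settle
        ("POST", f"/{index}/_close"),  # close again once the shards have moved
    ]


def drain(indices, send):
    """Process indices strictly one after another using the caller's send function."""
    for index in indices:
        for method, path in drain_steps(index):
            send(method, path)


if __name__ == "__main__":
    # Dry run: print the requests instead of sending them.
    drain(["logs-2019.01", "logs-2019.02"], lambda m, p: print(m, p))
```

A real script would probably also check `_cat/shards` before closing, to confirm that no copies of the index remain on the excluded nodes, since a health check alone may return before relocation has started.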