Closed index shard replication prior to Elasticsearch 7.2

Could someone clarify how replicas of closed index shards are handled prior to 7.2?

I work with a cluster that's currently running Elasticsearch 6.8.6. I have always thought that closed index shards are replicated, i.e. if the index has one replica while open, then it still has one replica after it's closed.

But the 7.2 release announcements make a big thing of how closed index shards are now replicated, which makes it sound like closed index shards were not replicated before.

When open, all the indices on our cluster have one replica. If I look on the disks of our Elasticsearch nodes for directories containing shards of a given closed index, I find two copies of each shard. If I open a closed index, its shards are (usually) all allocated without any copying of shards happening, which could only happen if Elasticsearch found a primary and replica of each shard on disk. (Sometimes, I think only when opening lots of indices at once requiring allocation of say 200-300 shards, some replicas are recreated by copying primaries.)

Is what's changed that Elasticsearch will now recreate shards of closed indices if the node they are on is removed, whereas before 7.2 that wouldn't happen? Or something else?

Yes, that's absolutely right. Prior to 7.2.0 closed indices were not actively replicated. If you closed an index then all the copies remained on disk, so you were protected against some failures, but there was no process to replace a copy that was lost due to a failure. For instance, if you added some new data nodes and then shut all the old data nodes down then all closed indices would be lost and you wouldn't know about it until you tried to open them again.

Sounds good. Does it mean that with 7.2 onwards it's possible to use cluster.routing.allocation.exclude to tell the cluster to move data off a node and it will move the shards of closed indices as well as those of open indices?

I had to decommission some nodes a while back which contained lots of shards of closed indices. It was quite tedious because getting the shards of closed indices off the nodes meant opening an index, waiting for the shards to move, then closing it again. I ended up writing a script to handle it automatically, but it still took ages. (Just opening all the affected indices at once would have almost certainly killed the cluster!)
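For anyone stuck on a pre-7.2 cluster, the open, wait, close loop described above might look roughly like the sketch below. This is not the original script: the function names, the `base_url` parameter and the `timeout` default are all made up for illustration, and it assumes the allocation exclusion is already in place and that waiting for green health with no relocating shards is a good enough signal that the shards have moved.

```python
import json
import urllib.request


def es_request(method, url, body=None):
    """Send one JSON request to Elasticsearch and return the decoded reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def drain_closed_indices(base_url, indices, send=es_request, timeout="30m"):
    """Open each index in turn, wait for its shards to settle, then close it.

    Handling one index at a time avoids opening everything at once.
    `send` is injectable so the loop can be dry-run without a cluster.
    """
    for index in indices:
        send("POST", f"{base_url}/{index}/_open")
        # Block until the index is green and nothing is relocating.
        health = send(
            "GET",
            f"{base_url}/_cluster/health/{index}"
            f"?wait_for_status=green"
            f"&wait_for_no_relocating_shards=true"
            f"&timeout={timeout}",
        )
        if health.get("timed_out"):
            # Stop here and leave the index open rather than close it mid-move.
            raise RuntimeError(f"timed out waiting for {index}")
        send("POST", f"{base_url}/{index}/_close")
```

Something like `drain_closed_indices("http://localhost:9200", ["my-closed-index"])` would then process a single (hypothetical) index. A real script would probably also want to check afterwards that no shard of the index is left on the excluded nodes.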

Yep, that's also absolutely right.
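As a rough illustration of the exclusion setting mentioned in the question, the request body for `PUT /_cluster/settings` could be built as in the sketch below. The helper name and the node names are hypothetical, and it assumes you want to filter by node name (`_name`) with a transient setting; filtering by `_ip` or `_host`, or using a persistent setting, follows the same shape.

```python
import json


def exclude_nodes_settings(node_names):
    """Build a cluster settings body that excludes the given nodes by name."""
    return {
        "transient": {
            # Comma-separated list of node names to move shards away from.
            "cluster.routing.allocation.exclude._name": ",".join(node_names)
        }
    }


# Hypothetical node names, for illustration only.
body = exclude_nodes_settings(["old-node-1", "old-node-2"])
print(json.dumps(body, indent=2))
```

On 7.2 and later this one setting should be enough for closed indices too, per the answer above; on earlier versions it only moves shards of open indices.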
