Closed index shard replication prior to Elasticsearch 7.2

mikewillis · February 7, 2020, 11:48am

Could someone clarify how replicas of closed index shards are handled prior to 7.2?

I work with a cluster that's currently running Elasticsearch 6.8.6. I have always thought that closed index shards are replicated, i.e. if the index has one replica while open, then it still has one replica after it's closed.

But https://www.elastic.co/guide/en/elasticsearch/reference/current/release-highlights-7.2.0.html makes a big thing of how closed index shards are now replicated, and https://github.com/elastic/elasticsearch/issues/33888 makes it sound like closed index shards are not replicated.

When open, all the indices on our cluster have one replica. If I look on the disks of our Elasticsearch nodes for directories containing shards of a given closed index, I find two copies of each shard. If I open a closed index, its shards are (usually) all allocated without any copying of shards happening, which could only happen if Elasticsearch found a primary and replica of each shard on disk. (Sometimes, I think only when opening lots of indices at once, requiring allocation of say 200-300 shards, some replicas are recreated by copying primaries.)

Is the change that Elasticsearch will now recreate shards of closed indices if the node they are on is removed, whereas before 7.2 that wouldn't happen? Or something else?

DavidTurner · February 7, 2020, 12:41pm

Yes, that's absolutely right. Prior to 7.2.0 closed indices were not actively replicated. If you closed an index then all the copies remained on disk, so you were protected against some failures, but there was no process to replace a copy that was lost due to a failure. For instance, if you added some new data nodes and then shut all the old data nodes down, then all closed indices would be lost and you wouldn't know about it until you tried to open them again.

mikewillis · February 7, 2020, 1:05pm

Sounds good. Does it mean that from 7.2 onwards it's possible to use cluster.routing.allocation.exclude to tell the cluster to move data off a node, and it will move the shards of closed indices as well as those of open indices?

I had to decommission some nodes a while back which contained lots of shards of closed indices. It was quite tedious because getting the shards of closed indices off the nodes meant opening an index, waiting for the shards to move, then closing it again. I ended up writing a script to handle it automatically, but it still took ages. (Just opening all the affected indices at once would have almost certainly killed the cluster!)
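
For anyone facing the same pre-7.2 situation, the open / wait / close loop described above might look roughly like the sketch below. This is not the script mentioned in the post. The three callables (open_index, close_index, shards_on_excluded_nodes) are hypothetical placeholders that you would implement against your own cluster client; only the control flow is illustrated. Indices are handled one at a time so that only one is ever open.

```python
import time
from typing import Callable, Iterable, List


def drain_closed_indices(
    indices: Iterable[str],
    open_index: Callable[[str], None],             # hypothetical: opens the named index
    close_index: Callable[[str], None],            # hypothetical: closes the named index
    shards_on_excluded_nodes: Callable[[str], int],  # hypothetical: shards still to move
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Open each index in turn, wait until none of its shards remain on the
    nodes being decommissioned, then close it again."""
    drained = []
    for name in indices:
        open_index(name)
        try:
            # Poll until the allocator has moved every shard off the excluded nodes.
            while shards_on_excluded_nodes(name) > 0:
                sleep(poll_seconds)
        finally:
            # Close the index again even if polling raised an error.
            close_index(name)
        drained.append(name)
    return drained
```

Processing indices sequentially is slow, as noted above, but it avoids the load spike of allocating hundreds of shards at once.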

DavidTurner · February 7, 2020, 1:08pm

Yep, that's also absolutely right.
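
As a rough illustration of the exclusion setting discussed above, the sketch below builds a request body that could be sent with PUT _cluster/settings. The thread does not say which node attribute was used; filtering by node name via the built-in _name attribute is an assumption here, and the node names are made up.

```python
import json
from typing import Dict, Iterable


def exclusion_settings(node_names: Iterable[str], persistent: bool = False) -> Dict:
    """Build a cluster settings body that excludes the given nodes from shard
    allocation, so the cluster moves their shards elsewhere."""
    scope = "persistent" if persistent else "transient"
    return {
        scope: {
            # Assumption: nodes are selected by name; a comma-separated list is accepted.
            "cluster.routing.allocation.exclude._name": ",".join(node_names)
        }
    }


# Hypothetical node names, for illustration only.
body = exclusion_settings(["old-node-1", "old-node-2"])
print(json.dumps(body, indent=2))
```

Per the answer above, on 7.2 and later this relocation covers shards of closed indices as well as open ones, so no open/close cycle is needed.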

system · March 6, 2020, 1:08pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
