Active shards after closing indices

bartelastic · May 8, 2020, 8:36am

I see that after closing indices the shards are still reported as active (see also "Elasticsearch 7.x closed indices retain shards"), and that this was done to implement the replicated closed indices feature.
Although I technically understand what to do now (increase max_shards_per_node), I was still wondering whether it was debated if another stat could be used for the feature, like the frozen status, which also seems to be attached to the closed indices.
Since Elasticsearch 7 the max_shards_per_node setting is mandatory, and I thought its purpose was to let us tune the number of active shards. But now the frozen and closed indices report their shards as active, even though those shards have a small footprint and do not compare with the resource usage of active shards in open indices.
So max_shards_per_node doesn't really say anything about resource usage and has to be increased artificially to make room for the shards of closed and frozen indices.

I am glad that we created monitoring for the percentage of allowed active shards.
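
A minimal sketch of that kind of check, for anyone wanting something similar. It assumes the cluster-wide limit is max_shards_per_node multiplied by the number of data nodes, and that both the active shard count and the data node count are taken from the cluster health response (active_shards and number_of_data_nodes). The function name and the default of 1000 per node are illustrative choices here, not something taken from our actual monitoring.

```python
def shard_limit_usage(active_shards, data_nodes, max_shards_per_node=1000):
    """Return the percentage of the cluster-wide shard limit currently in use.

    Assumption: the effective limit is max_shards_per_node * data_nodes.
    active_shards and data_nodes are expected to come from cluster health.
    """
    if data_nodes <= 0 or max_shards_per_node <= 0:
        raise ValueError("data_nodes and max_shards_per_node must be positive")
    limit = data_nodes * max_shards_per_node
    return 100.0 * active_shards / limit


# Example: 1500 active shards on 3 data nodes with the per-node limit at 1000
usage = shard_limit_usage(1500, 3)
print(f"{usage:.1f}% of the shard limit in use")
```

Alerting at a threshold well below 100% leaves time to either raise the limit or reduce the shard count before index creation starts failing.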

Just wondering whether any changes for this system were on the roadmap or what the deliberation on the pros and cons was.

Kind regards
Bart

DavidTurner · May 8, 2020, 9:27am

There were indeed discussions on whether to include closed and frozen indices in the count used by max_shards_per_node and it was a deliberate decision to include them, although there are good arguments in both directions. Closed and frozen indices are not completely free (e.g. they must be tracked by the master and recovered on failures) and the default max_shards_per_node limit is a very coarse safety feature to protect the cluster from really bad cases of oversharding rather than a recommended target.

Maintaining a large number of closed indices in your cluster is something of an antipattern. If you don't ever want to search them then it's better to offload them into a snapshot; if you do want to support occasional searches then it's better to freeze them.

For frozen indices the limit makes a bit more sense since searching a large number of small shards is relatively inefficient, so it's recommended to do some extra work before freezing to get the most out of your system. For instance, you can consolidate the data into fewer larger indices via reindex, shrink the indices to fewer larger shards, and force-merge them to a single segment.
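
The shrink, force-merge and freeze steps above can be sketched as a sequence of REST calls. This is only a rough outline for a 7.x cluster: the function name, index names and node name are hypothetical, nothing is sent to a cluster, and the reindex option is left out. It assumes the usual shrink preconditions, namely a write block on the source index and a copy of every shard on a single node.

```python
def shrink_and_freeze_plan(source, target, node_name, target_shards=1):
    """Build (method, path, body) tuples for shrinking, merging and freezing.

    Nothing is sent here; the caller issues the requests with its own client
    and should wait for shard relocation to finish after the first step.
    target_shards must be a factor of the source index's shard count.
    """
    return [
        # 1. Block writes and gather a copy of every shard on one node.
        ("PUT", f"/{source}/_settings", {
            "index.routing.allocation.require._name": node_name,
            "index.blocks.write": True,
        }),
        # 2. Shrink into the target, clearing the settings copied from the source.
        ("POST", f"/{source}/_shrink/{target}", {
            "settings": {
                "index.number_of_shards": target_shards,
                "index.routing.allocation.require._name": None,
                "index.blocks.write": None,
            }
        }),
        # 3. Force-merge each shard of the target down to a single segment.
        ("POST", f"/{target}/_forcemerge?max_num_segments=1", None),
        # 4. Freeze the target index.
        ("POST", f"/{target}/_freeze", None),
    ]


for method, path, body in shrink_and_freeze_plan(
        "logs-2020.05.01", "logs-2020.05.01-shrunk", "node-1"):
    print(method, path, body)
```

Building the plan as plain data keeps the ordering explicit and makes it easy to review before running anything against a real cluster.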

bartelastic · May 8, 2020, 9:47am

Thanks for your reply! OK, at the moment we are freezing them after a week and closing them a while later (the timing depends on whether it is our dev or prod environment). Maybe we shouldn't close indices before deleting them, as it adds little to performance. And maybe we should do some extra work before freezing. For now it works well enough, but the data is growing, so it is best to keep that in mind.
Apparently I took max_shards_per_node to be a less coarse feature than was intended.
Thanks again for the clear explanation!

bartelastic · May 8, 2020, 10:04am

One more question: would it be an idea to mention this specifically on https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-total-shards.html, https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-close.html and https://www.elastic.co/guide/en/elasticsearch/reference/current/freeze-index-api.html?

DavidTurner · May 8, 2020, 10:32am

I think there is indeed a bit of scope to improve the docs.

I think https://www.elastic.co/guide/en/elasticsearch/reference/7.6/allocation-total-shards.html isn't relevant, that's to do with allocation rather than oversharding.
I was going to suggest improving https://www.elastic.co/guide/en/elasticsearch/reference/7.6/misc-cluster.html#cluster-shard-limit but I see that this actually already says that closed indices are not counted here, sorry. I should have checked that; I remember the discussions but clearly forgot the exact outcome.
https://www.elastic.co/guide/en/elasticsearch/reference/7.6/best_practices.html does mention force-merging but could also reasonably mention shrinking, and could also point out that although frozen indices are lighter than non-frozen ones they can still be a source of oversharding (ref. https://www.elastic.co/guide/en/elasticsearch/reference/7.6/avoid-oversharding.html) and in particular that https://www.elastic.co/guide/en/elasticsearch/reference/6.8/misc-cluster.html#cluster-shard-limit applies to them too.
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-close.html could IMO bear a suggestion to consider snapshots or frozen indices instead of long-term closed indices.

All that would be subject to some internal discussion too, we have to be careful that the reference docs don't miss any subtleties, but a PR to do some or all of those changes would be welcome.

bartelastic · May 8, 2020, 10:44am

I'll work on that

system · June 5, 2020, 10:44am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
