Graceful shard management?

Most challenging problems occur "when things have gone wrong". We have Warm Nodes that are oversharded and that have network-attached storage mounted by mistake. As a result, a good amount of data already lives on the NAS, and the performance impact now destabilizes the whole 35-node cluster daily.

We want to devise a way to move shards off the affected Warm Nodes gradually, while at the same time preventing any new shards from being allocated to the affected nodes. We don't want to dump all shards at once the way the allocation.exclude filter does; we just want to prevent new shards from appearing on the target node. We have dozens of policies and templates which are not organized or standardized, so we're hoping for a cluster-wide solution.

Is there a way to simply say "don't put any new shards on this node"?

Thanks!

Hi Curtis!

No, sorry. Roughly speaking, at the cluster level either you want shards on a node or you don't. You could perhaps apply an allocation filter only to new indices?
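For illustration, one way to do that is a catch-all legacy index template that adds an exclusion setting to every newly created index (the node name `warm-node-5` is hypothetical; legacy templates merge with your existing ones by `order`, whereas composable index templates do not merge this way, so check which kind you use):

```
PUT _template/keep-off-warm-node-5
{
  "index_patterns": ["*"],
  "order": 100,
  "settings": {
    "index.routing.allocation.exclude._name": "warm-node-5"
  }
}
```

Existing indices are unaffected; only indices created after the template is in place pick up the exclusion.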

Why do you need to do anything more gradually than Elasticsearch would with an allocation filter that excludes the node? There are already controls in place to make sure that a node is evacuated slowly enough that it doesn't destabilise the cluster. Do you just not have the space to put the shards elsewhere right now?
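For reference, the cluster-wide exclusion mentioned above looks something like this (the node name is hypothetical):

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._name": "warm-node-5"
  }
}
```

This tells the allocator to move all shards off that node, but the actual relocations are still throttled by the usual recovery limits rather than happening all at once.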

6 of our 10 Warm Nodes have 1 SSD and 1 NFS mount, while the remaining 4 have only SSD. When we bring 1 of the NFS/SSD nodes down, the subsequent relocation of shards, which involves the 5 other NFS/SSD nodes, causes a cascading failure of those remaining 5. The NFS can't keep up with the juggling.

That usually means your recovery settings are too aggressive. Often folks have cluster.routing.allocation.node_concurrent_recoveries and friends set far too high (2 is the default, and that's a good number), and indices.recovery.max_bytes_per_sec is another popular setting to turn up to unreasonable levels. How are these configured for you?
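You can check the effective values with the cluster settings API, for example:

```
GET _cluster/settings?include_defaults=true&flat_settings=true&filter_path=*.cluster.routing.allocation.node_concurrent_recoveries,*.indices.recovery.max_bytes_per_sec
```

`include_defaults` makes the response show values you haven't overridden, and `filter_path` trims the output to just the settings of interest.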

We are set for the default 2 concurrent recoveries, and:

```
'indices': {'recovery': {'max_bytes_per_sec': '375mb'}},
```

375MB/s (equivalent to 3Gbps) seems pretty punchy; I'd recommend toning that down if your cluster struggles when moving shards around. I imagine you can leave it higher on the SSD-only nodes, assuming your network can cope at least.
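Lowering it dynamically might look something like this (the 100mb figure is just an illustrative starting point, not a recommendation from this thread; tune it to what your NFS mounts can sustain):

```
PUT _cluster/settings
{
  "persistent": {
    "indices.recovery.max_bytes_per_sec": "100mb"
  }
}
```

Note that a dynamic cluster setting applies uniformly to every node and takes precedence over elasticsearch.yml, so to keep a higher limit on the SSD-only nodes you'd instead set indices.recovery.max_bytes_per_sec per node in each node's elasticsearch.yml and leave the dynamic setting unset.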