Shard Rebalancing Delay

I have a 5-node cluster running ES 2.4.0, supporting both development and staging environments. It is common for these nodes to be restarted, especially due to development changes. I have one client node, two data-only nodes and two master-eligible nodes.

My problem is that when I restart even just the ES service on a node, my cluster immediately freaks out and reallocates all the shards that were assigned to that node, even if the service was down for less than a minute. It then takes at least ten minutes for all the shards to recover and for my cluster to return to 'Green' status.

I've been looking at the documentation and thought I saw a delay you could configure in ES, but I can't find it anymore. Does anyone know what setting I need to change so that my ES cluster does not reallocate shards from a node unless the node has been down for some minimum time (e.g. 5 minutes)?

I remembered seeing that too and had to go into my notes to find it, as I couldn't see it in the docs either. Once you know what to look for, it's easy to find.

index.unassigned.node_left.delayed_timeout

https://www.elastic.co/guide/en/elasticsearch/reference/current/delayed-allocation.html
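In case it helps, here is a rough sketch of applying that setting through the index settings API, using only the Python standard library. The `http://localhost:9200` address, the `5m` value, and the helper names `build_delay_request` and `apply_delay` are assumptions for illustration, not anything specific to your cluster, so adjust them to suit.

```python
import json
import urllib.request

# Setting name from the delayed-allocation docs linked above.
SETTING = "index.unassigned.node_left.delayed_timeout"


def build_delay_request(base_url, timeout="5m", index="_all"):
    """Build (but do not send) a PUT <base_url>/<index>/_settings request.

    base_url, timeout and index are illustrative defaults, not values
    taken from the original question's cluster.
    """
    body = json.dumps({"settings": {SETTING: timeout}}).encode("utf-8")
    url = "{}/{}/_settings".format(base_url.rstrip("/"), index)
    req = urllib.request.Request(url, data=body, method="PUT")
    req.add_header("Content-Type", "application/json")
    return req


def apply_delay(base_url, timeout="5m", index="_all"):
    """Send the request and return the response body as text."""
    req = build_delay_request(base_url, timeout, index)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


# Example (needs a reachable cluster; the address is an assumption):
# print(apply_delay("http://localhost:9200", timeout="5m"))
```

This is a dynamic per-index setting, so using `_all` updates every existing index; indices created later would need it set again or supplied through an index template.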

Kimbro


Exactly what I was looking for. Thanks a bunch Kimbro!