Node reduction - Should I use "primaries" or "none"

Hi Everyone,

I am reducing an Elasticsearch cluster from 6 nodes to 3.

I have already started the process and can see shards draining from the nodes.

Nodes were taken out of the pool with the correct command so that part is all good:

"cluster.routing.allocation.exclude._name": "<redacted>,<redacted>,<redacted>"

While the initial drain seemed to happen quite quickly, it appears to have slowed to a crawl now, and I am wondering if I used the right setting by only allowing allocation of primaries. I have seen posts that recommend "none" for this setting:

"cluster.routing.allocation.enable": "primaries"

When I set this to none, the cluster turns RED and I get a lot of unassigned shards (I guess this is expected because they can't go anywhere with 'none') - but then I have read posts saying that RED is not a bad thing during this process...

But... will the cluster eventually sort itself out? Or am I being too impatient, and I just need to wait for the shards to drain naturally with the "primaries" setting while Elasticsearch works out (and then drops) the shards it no longer needs on the nodes being drained?

Thanks in advance!

Which posts? If these are old posts they may be referring to old versions.

Normally you would set allocation to none during a restart to avoid shards moving when a node is known to be back online shortly, but I don't think that has been the recommendation in recent years, since you can delay allocation instead.
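
As a rough sketch, delayed allocation is an index setting along these lines (the _all target and the 5m value are just example placeholders):

PUT _all/_settings
{
  "settings": {
    // example timeout only; tune to how long your restarts actually take
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}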

If you set it to none, no shards will be allowed to be allocated anywhere in the cluster, including the shards you need to move from one node to another, so you should not set it to none.

Which posts? They may also be referring to old versions; can you share some examples? A RED cluster is not a desired state.

Which version are you running? You didn't say.

When you exclude a node from allocation, Elasticsearch will start moving shards off that node. Depending on the number of shards and their size this can take a long time, so you probably just need to wait.
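
While you wait, you can watch the relocations with the cat APIs; something like the following should show what is currently moving (just a suggestion, there are other ways to check):

GET _cat/recovery?v&active_only=true

GET _cat/shards?v

The second one lists every shard with its state, so you can filter for RELOCATING.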

How much data do you have on those nodes? Do the 3 remaining nodes have enough space to receive all the data?

Thanks - I am running 6.8.23. We are looking to reduce the cluster and eventually move to the cloud, but that is some way off.

Thank you for confirming how none behaves. I suspect I just need to wait.

I have doubled the disk capacity of the receiving nodes, so space is not an issue, and I'm not hitting any watermark issues, etc.

The shards are moving very slowly, so as you point out, I likely just need to be patient with them.

The guide I followed was this one:

I know the ES documentation is available, but it lacks a lot of examples and seems to assume prior knowledge, so this guide was a good one for me (apart from the "none" recommendation).

Current progress - so 'stuff' is happening:

shards  used   free   size   nodeip     node
392     1.8tb  4.7tb  6.8tb  10.x.x.12  <redacted>-01
394     1.8tb  4.7tb  6.8tb  10.x.x.14  <redacted>-02
394     1.7tb  4.7tb  6.8tb  10.x.x.16  <redacted>-03
199     1.1tb  2.5tb  3.8tb  10.x.x.13  <redacted>-04  <-- being removed
218     1.1tb  2.4tb  3.8tb  10.x.x.15  <redacted>-05  <-- being removed
263     1.4tb  2.2tb  3.8tb  10.x.x.17  <redacted>-06  <-- being removed
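
(For reference, I pulled that view with something close to the following; the exact column list is from memory, so treat it as approximate:)

GET _cat/allocation?v&h=shards,disk.used,disk.avail,disk.total,ip,node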

Having never done this before, I hope that the cluster will sort itself out and that eventually 04, 05, and 06 will get down to zero?

The cluster is green and happy at the moment. No issues.

Article about RED not being all bad :slight_smile:

You want cluster.routing.allocation.enable: all (the default). Setting this to primaries will prevent allocation of replicas, but you want to move all shards (both primaries and replicas). Setting it to none makes even less sense since this blocks all allocation.
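
Something along these lines should do it (alternatively, setting the value to null removes the override and falls back to the default):

PUT _cluster/settings
{
  // assumes the setting was applied as a transient setting in the first place
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}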

RED is always a bad thing. The article you linked is about how to fix it ASAP.

Weird, no idea why this is saying to use none. You want all throughout.

Ah ok... so now I'm even more confused... :laughing:

My understanding (likely very wrong now) was that having this set to "all" would still try to place shards onto the removed nodes.

I suspect I'm reading old documents and now ES is smart enough to know that the hosts have been excluded.

I will change it to all. That might be the missing part of the puzzle.

YES! That was the issue...

Now it's powering through the shards...


shards  used   free   size   nodeip     node
395     1.8tb  4.7tb  6.8tb  10.x.x.12  <redacted>-01
400     1.8tb  4.7tb  6.8tb  10.x.x.14  <redacted>-02
395     1.7tb  4.7tb  6.8tb  10.x.x.16  <redacted>-03
193     1.1tb  2.5tb  3.8tb  10.x.x.13  <redacted>-04  <-- being removed
216     1.1tb  2.4tb  3.8tb  10.x.x.15  <redacted>-05  <-- being removed
261     1.4tb  2.2tb  3.8tb  10.x.x.17  <redacted>-06  <-- being removed

THANK YOU!!! :folded_hands:

You are using a pretty ancient version of ES, but I don't think this advice has changed going back to the dawn of time. In contrast, the linked docs from Opster are dated 2024-12-16, which is pretty recent. And Elastic bought Opster quite some time ago. These docs are just plain wrong unfortunately; I'm trying to work out how to fix them.

Apologies for this mistake, we'll fix it asap!

The document now shows "Updated: Mar 17, 2025 | 2 min read", so that was a quick fix. Well done.

Great outcome. Thank you everyone!