Shard redistribution incomplete after upgrade

Hello,

I was informed of a vulnerability [CVE-2025-25012] in older versions of ELK, so I set out to go through the whole process of upgrading to the latest version. This is the second time I have done it, and I am worried because something always goes wrong, such as the cluster ending up in yellow status.

I have a cluster of 3 Elasticsearch servers
One server for Kibana
One server for Logstash
One server for Fleet Server

I upgraded from version 8.15.3 to version 8.17.4, and in the last step, where the shards are redistributed, it stopped at 99.3%.

GET _cat/health?v=true

GET _cat/shards?v=true&h=index,shard,prirep,state,node,unassigned.reason&s=state

According to the documentation, I cannot upgrade the other Elasticsearch servers until the shard redistribution has finished.

Thank you for your recommendations

The relevant troubleshooting guide is here. If you need help understanding any of the steps, please let us know.


Hello @DavidTurner, thanks for your answer. I had just reviewed this documentation, but for my current level of knowledge it is very dense. To be more specific, I do not fully understand everything it mentions or the implications. It may seem perfectly clear to you, but it is not to me.

Searching the internet, I found the command “GET /_cluster/allocation/explain”, and among all the information it returns, this explanation caught my attention:

“cannot allocate replica shard to a node with version [8.15.3] since this is older than the primary version [8.17.4]”

This doesn't make sense to me. The version difference is normal, because I am upgrading the servers one at a time.

Given the above, running “POST /_cluster/reroute?retry_failed”, which I understand retries the shard redistribution, would not solve the problem.

So, as a suggestion: the documentation in this case is very general, and reads as documentation by experts for experts. That is why I turn to the forum, looking for suggestions to solve the problem under the specific conditions I have described. Please don't misunderstand me, I don't mean this as a criticism. It is just important to keep in mind that not all Elastic Stack users have the same level of knowledge or certifications. Sometimes someone simply leaves a position and it is handed over to another person who has never used ELK in their life.

Ah yeah that can happen until you upgrade the second node. As long as that's the only thing blocking these shards from allocating, you should be good to proceed.
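One way to check that the version mismatch really is the only thing blocking allocation is to list the deciders that answered NO in the `GET /_cluster/allocation/explain` response. Here is a minimal Python sketch, assuming the response has already been parsed into a dict. The helper name `blocking_deciders`, the node names and the sample response are hypothetical, and the sample is trimmed to the fields used here:

```python
def blocking_deciders(explain):
    """Map each node name to the deciders that returned NO for the shard."""
    blocked = {}
    for node in explain.get("node_allocation_decisions", []):
        names = [
            d["decider"]
            for d in node.get("deciders", [])
            if d.get("decision") == "NO"
        ]
        if names:
            blocked[node["node_name"]] = names
    return blocked


# Hypothetical, trimmed response for one unassigned replica.
version_msg = (
    "cannot allocate replica shard to a node with version [8.15.3] "
    "since this is older than the primary version [8.17.4]"
)
sample = {
    "node_allocation_decisions": [
        # Already-upgraded node: assumed to hold the primary, so it cannot
        # also take the replica of the same shard.
        {"node_name": "es-node-1",
         "deciders": [{"decider": "same_shard", "decision": "NO"}]},
        # Nodes still on the old version.
        {"node_name": "es-node-2",
         "deciders": [{"decider": "node_version", "decision": "NO",
                       "explanation": version_msg}]},
        {"node_name": "es-node-3",
         "deciders": [{"decider": "node_version", "decision": "NO",
                       "explanation": version_msg}]},
    ]
}

for node_name, deciders in blocking_deciders(sample).items():
    print(node_name, deciders)
```

If anything other than the version check (and the same-shard check on the node that holds the primary) shows up, that would be worth looking into before upgrading the next node.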


@DavidTurner So I simply upgrade the second server in the cluster and skip the reroute step?

What is the reroute step? Sorry, not sure what you mean by that.

Otherwise yes, it will be ok to upgrade the next node since all of the missing shards have a primary on the node you've already upgraded and therefore the cluster health will remain yellow and not go red.
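The yellow-versus-red reasoning can be sketched from the rows of `GET _cat/shards?format=json&h=index,shard,prirep,state,node,unassigned.reason`. This is a simplified Python illustration rather than the exact rule Elasticsearch applies: the helper name `expected_health` and the sample rows (including the index and node names) are hypothetical, and any shard that is not `STARTED` or `RELOCATING` is treated as inactive.

```python
def expected_health(shards):
    """Derive a health colour from _cat/shards rows (simplified)."""
    inactive = [s for s in shards if s["state"] not in ("STARTED", "RELOCATING")]
    if any(s["prirep"] == "p" for s in inactive):
        return "red"      # at least one primary is not active
    if inactive:
        return "yellow"   # all primaries active, some replica is not
    return "green"


# Hypothetical rows: the primary is started on the upgraded node and
# its replica is still waiting for another node to be upgraded.
rows = [
    {"index": "logs-example", "shard": "0", "prirep": "p",
     "state": "STARTED", "node": "es-node-1", "unassigned.reason": None},
    {"index": "logs-example", "shard": "0", "prirep": "r",
     "state": "UNASSIGNED", "node": None, "unassigned.reason": "NODE_LEFT"},
]

print(expected_health(rows))
```

As long as every unassigned row has `prirep` equal to `r`, this sketch reports yellow, which matches the situation described above.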


The reroute step forces the shard redistribution to be attempted again. But it was not necessary.

As you said, after I upgraded the second Elasticsearch server, the redistribution reached 100%.

Thank you very much.


Oh right you mean POST _cluster/reroute? That is never necessary unless GET /_cluster/allocation/explain tells you it's needed.
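As a rough illustration of how to tell whether the explain output is asking for a retry, the Python sketch below looks for a retry-limit decider that answered NO. It assumes the response has been parsed into a dict and that this case is reported by a decider named `max_retry`, which is my understanding of the response format rather than something confirmed in this thread. The helper name `retry_failed_suggested` is hypothetical.

```python
def retry_failed_suggested(explain):
    """Return True if any node reports the retry limit as a blocker."""
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            # Assumption: the retry limit appears as decider "max_retry".
            if decider.get("decider") == "max_retry" and decider.get("decision") == "NO":
                return True
    return False


# Hypothetical, trimmed response where only the version check blocks allocation.
version_only = {
    "node_allocation_decisions": [
        {"node_name": "es-node-2",
         "deciders": [{"decider": "node_version", "decision": "NO"}]},
    ]
}

print(retry_failed_suggested(version_only))
```

In the version-mismatch case from this thread no such decider is involved, so a retry would not change anything.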
