During a rolling restart sometimes all replicas of a single shard go into PRIMARY_FAILED

I have an issue where, during a rolling restart, when the restart reaches a node that holds a primary shard, the replica copies of that shard sometimes go into a PRIMARY_FAILED state once the node is offline.

e.g.:

my-index               11 p UNASSIGNED NODE_LEFT 
my-index               11 r UNASSIGNED PRIMARY_FAILED
my-index               11 r UNASSIGNED PRIMARY_FAILED
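(That output is from the cat shards API; a request along these lines, with the unassigned.reason column included, reproduces it. The exact column list here is just what I happened to ask for:)

GET _cat/shards/my-index?v&h=index,shard,prirep,state,unassigned.reason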

This doesn't seem to happen every time, and I haven't found a way to reproduce it consistently.

According to the documentation, PRIMARY_FAILED means "The shard was initializing as a replica, but the primary shard failed before the initialization completed."

How do I prevent this? I am restarting one node at a time and waiting for all shards to be allocated and the cluster to be green before moving on to the next node. Shard allocation is disabled before each node is taken down and re-enabled once the node is back online.
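For reference, the allocation toggling is done with the cluster settings API, roughly like this (I'm showing "primaries" here since that's the value the rolling restart guide uses; substitute whatever value you set, and re-enable with null once the node is back):

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}

(restart the node, wait for it to rejoin the cluster)

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}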

I can't really find any documentation on how to prevent this, and I am following all the steps in Full-cluster restart and rolling restart | Elasticsearch Guide [8.1] | Elastic.

So I am not really sure what is causing this. Any ideas or insights would be great!

Welcome to our community! :smiley:

What do the logs on your master node show at this time for that index?

Unfortunately not much.

I see shard allocation being turned off, the node leaving the cluster, logs about marking unavailable shards as stale (posted below), then a bit later the node rejoining the cluster and allocation being turned back on.

{"type": "server", "timestamp": "2022-03-09T22:40:52,226Z", "level": "WARN", "component": "o.e.c.r.a.AllocationService", "cluster.name": "my-cluster", "node.name": "my-cluster-es-master-4", "message": "[my-index][7] marking unavailable shards as stale: [cyzHRssCRd-PJ8FYF9zAGQ]", "cluster.uuid": "nVZb27XkRkmc5vsbGxqfng", "node.id": "jZz5usD4SSKR4hwrJRJCtw"  }
{"type": "server", "timestamp": "2022-03-09T22:40:53,107Z", "level": "WARN", "component": "o.e.c.r.a.AllocationService", "cluster.name": "my-cluster", "node.name": "my-cluster-es-master-4", "message": "[my-index][2] marking unavailable shards as stale: [iRQsE2dkQ2qgoKJup7PmFw]", "cluster.uuid": "nVZb27XkRkmc5vsbGxqfng", "node.id": "jZz5usD4SSKR4hwrJRJCtw"  }
{"type": "server", "timestamp": "2022-03-09T22:40:53,456Z", "level": "WARN", "component": "o.e.c.r.a.AllocationService", "cluster.name": "my-cluster", "node.name": "my-cluster-es-master-4", "message": "[my-index][6] marking unavailable shards as stale: [IABho5n9TJqOdYFy_K90Yw]", "cluster.uuid": "nVZb27XkRkmc5vsbGxqfng", "node.id": "jZz5usD4SSKR4hwrJRJCtw"  }
{"type": "server", "timestamp": "2022-03-09T22:40:53,491Z", "level": "WARN", "component": "o.e.c.r.a.AllocationService", "cluster.name": "my-cluster", "node.name": "my-cluster-es-master-4", "message": "[my-index][11] marking unavailable shards as stale: [ChD9fRVhSyeh7EuGT_N_Fg]", "cluster.uuid": "nVZb27XkRkmc5vsbGxqfng", "node.id": "jZz5usD4SSKR4hwrJRJCtw"  }

Nothing else relevant in the logs on any of my master nodes. I'm using the default out-of-the-box logging setup.
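The only other thing I can think of checking is the allocation explain output for one of the stuck replicas while it is unassigned, something along these lines (shard 11 picked just to match the example above):

GET _cluster/allocation/explain
{
  "index": "my-index",
  "shard": 11,
  "primary": false
}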

@warkolm Any ideas on what to check next?
