Unassigned Shards after Master node restart

Hello, I am new to Elasticsearch clusters, so sorry for the long post. I am looking for help to revive some unassigned shards that are the result of me abruptly shutting down a master node. This also affects Kibana, which now has trouble reaching the Elasticsearch service. I have 3 master nodes and 3 data nodes, running on a Kubernetes cluster, so they use persistent volumes (volumes that retain data across restarts). All was good until 2 weeks ago. For a few reasons I had to restart the master and data nodes, which resulted in a mess with the volumes, i.e. one of the data nodes and one of the master nodes failed to recognize its volume. I assume this has to do with the way I restarted these nodes. Fast forward: I was able to reattach the volumes to these master and data nodes, and I am trying to bring the cluster back up, but it fails.
Here are the checks I did, based on previous posts and blog links.

The cluster health, filtered to the status and shard counts, gives this result:

Executing curl against _cluster/health?filter_path=status,*_shards
{
	"status": "red",
	"active_primary_shards": 5,
	"active_shards": 10,
	"relocating_shards": 0,
	"initializing_shards": 0,
	"unassigned_shards": 10,
	"delayed_unassigned_shards": 0
}
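
For reference, the full request was along these lines; the host, port, and credentials are placeholders for my setup:

curl -sk -u elastic:<password> "https://localhost:9200/_cluster/health?filter_path=status,*_shards&pretty"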

So, there are 10 unassigned shards. Looking into which shards are unassigned:

_cat/shards?h=index,shard,prirep,state,unassigned.reason

logstash-2022.02.21         0 p STARTED    
logstash-2022.02.21         0 r STARTED    
.monitoring-es-7-2022.02.22 0 p STARTED    
.monitoring-es-7-2022.02.22 0 r STARTED    
.kibana_2                   0 p STARTED    
.kibana_2                   0 r STARTED    
.kibana_task_manager_1      0 r UNASSIGNED REPLICA_ADDED
.kibana_task_manager_1      0 p UNASSIGNED CLUSTER_RECOVERED
logstash-2022.02.22         0 r STARTED    
logstash-2022.02.22         0 p STARTED    
.security-7                 0 p UNASSIGNED CLUSTER_RECOVERED
.security-7                 0 r UNASSIGNED REPLICA_ADDED
.monitoring-es-7-2022.02.21 0 p STARTED    
.monitoring-es-7-2022.02.21 0 r STARTED    
.tasks                      0 r UNASSIGNED REPLICA_ADDED
.tasks                      0 p UNASSIGNED CLUSTER_RECOVERED
.apm-agent-configuration    0 p UNASSIGNED CLUSTER_RECOVERED
.apm-agent-configuration    0 r UNASSIGNED REPLICA_ADDED
.kibana_1                   0 r UNASSIGNED REPLICA_ADDED
.kibana_1                   0 p UNASSIGNED CLUSTER_RECOVERED
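
The same listing with a node column makes it easier to see where the STARTED copies live; these are just standard _cat/shards columns, nothing specific to my cluster:

curl -s "localhost:9200/_cat/shards?v&h=index,shard,prirep,state,node,unassigned.reason"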

Looking at the 'STARTED' shards, they are all on data nodes and I do not see any indices on the master nodes now. I believe they were there before I made my changes, so I do not even know which of the master nodes these indices were related to.
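
For a per-node view, the standard _cat/allocation API lists how many shards each node currently holds and how much disk they use, in case anyone wants to cross-check this (host is a placeholder):

curl -s "localhost:9200/_cat/allocation?v"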

A couple of other troubleshooting steps over these unassigned shards:
Here is part of the output of _cluster/reroute?retry_failed=true, which does not help either.
(I did not paste the whole output here as it is too long.)

{
	"acknowledged": true,
	"state": {
		"cluster_uuid": "EQQ472KzSX6QcuQCs3jRuw",
		"version": 30843835,
		"state_uuid": "pKPDGAXlRhmIV-6RIUtYmg",
		"master_node": "QfuHnzYfRS-e8hgkrLTQlQ",
		"blocks": {},
		"nodes": {
			"6zS21m_HQFaBJolZQdEj7g": {
				"name": "elastic-cluster-es-data-1",
				"ephemeral_id": "ZoB5E487SYidstzd0J-fjA",
				"transport_address": "10.233.93.153:9300",
				"attributes": {
					"xpack.installed": "true"
				}
			},
			"QfuHnzYfRS-e8hgkrLTQlQ": {
				"name": "elastic-cluster-es-master-0",
				"ephemeral_id": "ipSIsbcFTpuPgTvJ9JMZbw",
				"transport_address": "10.233.101.182:9300",
				"attributes": {
					"xpack.installed": "true"
				}
			},
		................
				}
			}
		},
		"routing_table": {
			"indices": {
				".security-7": {
					"shards": {
						"0": [{
							"state": "UNASSIGNED",
							"primary": true,
							"node": null,
							"relocating_node": null,
							"shard": 0,
							"index": ".security-7",
							"recovery_source": {
								"type": "EXISTING_STORE",
								"bootstrap_new_history_uuid": false
							},
							"unassigned_info": {
								"reason": "CLUSTER_RECOVERED",
								"at": "2022-02-21T18:15:40.290Z",
								"delayed": false,
								"allocation_status": "no_valid_shard_copy"
							}
						}, {
							"state": "UNASSIGNED",
							"primary": false,
							"node": null,
							"relocating_node": null,
							"shard": 0,
							"index": ".security-7",
							"recovery_source": {
								"type": "PEER"
							},
							"unassigned_info": {
								"reason": "REPLICA_ADDED",
								"at": "2022-02-21T18:17:23.298Z",
								"delayed": false,
								"allocation_status": "no_attempt"
							}
			.............
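
The exact call for this step is just the reroute endpoint with retry_failed and no body, roughly:

curl -s -X POST "localhost:9200/_cluster/reroute?retry_failed=true&pretty"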

When I query the shard stores:

_shard_stores?pretty

{
  "error" : {
    "root_cause" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

The same error comes back for _cat/nodes?v.
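
I assume the allocation explain API would fail the same way while no master can be discovered, but for reference this is the shape of the request it takes, using the .security-7 primary as an example:

curl -s -X POST "localhost:9200/_cluster/allocation/explain?pretty" -H 'Content-Type: application/json' -d '
{
  "index": ".security-7",
  "shard": 0,
  "primary": true
}'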

Before executing the command below, I checked the master nodes and did not find any indices under /usr/share/Elasticsearch/nodes/, although there were indices on the data nodes. Now, after executing the command, I do not find anything under the nodes folder at all. I am not sure whether the reroute command cleaned everything up.
_cluster/reroute (according to Fix common cluster issues | Elasticsearch Guide [master] | Elastic)
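
For the on-disk check I exec into the pods, roughly like this; the pod name matches my node name, and the data path is the default for the official image, so both may differ in your deployment:

kubectl exec elastic-cluster-es-data-1 -- ls /usr/share/elasticsearch/data/nodes/0/indices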

I do not know what kind of issue I have run into, but any help is greatly appreciated.

Hello, Elastic team - Can you help here please?

TL;DR: you probably need to accept some data loss here and just delete the indices that are showing as red.
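
If you go that route it is just a DELETE per red index, e.g. (the index name here is only an example, and deleting it means accepting that its data is gone):

curl -X DELETE "localhost:9200/.kibana_1"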

Thanks, the allocate-to-stale-primary command, "allocate_stale_primary", worked and the cluster is back up.
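
For anyone who finds this later, the command I mean is the reroute API with an allocate_stale_primary command, roughly like this; the index and node name are the ones from my cluster, and accept_data_loss has to be set explicitly:

curl -X POST "localhost:9200/_cluster/reroute?pretty" -H 'Content-Type: application/json' -d '
{
  "commands": [
    {
      "allocate_stale_primary": {
        "index": ".security-7",
        "shard": 0,
        "node": "elastic-cluster-es-data-1",
        "accept_data_loss": true
      }
    }
  ]
}'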
