Hello, I am new to Elasticsearch clusters, sorry for the long post. I am looking for help reviving some unassigned shards, which resulted from me abruptly shutting down the master node. This also affects Kibana, which has trouble reaching the Elasticsearch service.

I have 3 master nodes and 3 data nodes, running on a Kubernetes cluster, so they use persistent volumes (volumes that retain data across restarts). All was good until 2 weeks ago. For a few reasons, I had to restart the master and data nodes, which left the volumes in a mess: one of the data nodes and one of the master nodes failed to recognize their volumes. I assume this has to do with how I restarted these nodes. Fast forward: I was able to reattach the volumes to these master and data nodes, and I am trying to bring the cluster back up, but it fails.
Here are the checks I did, based on previous posts and blog links.
The cluster health, filtered down to shard counts, gives this result:
Executing curl against _cluster/health?filter_path=status,*_shards
{
"status": "red",
"active_primary_shards": 5,
"active_shards": 10,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 10,
"delayed_unassigned_shards": 0
}
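To sanity-check my reading of those numbers, here is a quick sketch (plain Python, just parsing the JSON above; as I understand it, red status specifically means at least one primary is unassigned, while yellow would mean only replicas are missing):

```python
import json

# Health output from _cluster/health?filter_path=status,*_shards, pasted above.
health = json.loads("""
{
  "status": "red",
  "active_primary_shards": 5,
  "active_shards": 10,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 10,
  "delayed_unassigned_shards": 0
}
""")

# Total shard copies the cluster knows about, in any state.
total = (health["active_shards"] + health["unassigned_shards"]
         + health["relocating_shards"] + health["initializing_shards"])
print(f"{health['unassigned_shards']} of {total} shard copies unassigned")
# → 10 of 20 shard copies unassigned
```

So exactly half the shard copies are unassigned, and since the cluster is red, that must include primaries.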
So, there are 10 unassigned shards. Looking into which shards are unassigned:
_cat/shards?h=index,shard,prirep,state,unassigned.reason
logstash-2022.02.21 0 p STARTED
logstash-2022.02.21 0 r STARTED
.monitoring-es-7-2022.02.22 0 p STARTED
.monitoring-es-7-2022.02.22 0 r STARTED
.kibana_2 0 p STARTED
.kibana_2 0 r STARTED
.kibana_task_manager_1 0 r UNASSIGNED REPLICA_ADDED
.kibana_task_manager_1 0 p UNASSIGNED CLUSTER_RECOVERED
logstash-2022.02.22 0 r STARTED
logstash-2022.02.22 0 p STARTED
.security-7 0 p UNASSIGNED CLUSTER_RECOVERED
.security-7 0 r UNASSIGNED REPLICA_ADDED
.monitoring-es-7-2022.02.21 0 p STARTED
.monitoring-es-7-2022.02.21 0 r STARTED
.tasks 0 r UNASSIGNED REPLICA_ADDED
.tasks 0 p UNASSIGNED CLUSTER_RECOVERED
.apm-agent-configuration 0 p UNASSIGNED CLUSTER_RECOVERED
.apm-agent-configuration 0 r UNASSIGNED REPLICA_ADDED
.kibana_1 0 r UNASSIGNED REPLICA_ADDED
.kibana_1 0 p UNASSIGNED CLUSTER_RECOVERED
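To tally the unassigned shards by reason, I ran this quick script over the pasted output (plain Python, nothing cluster-specific):

```python
from collections import Counter

# Raw output of _cat/shards?h=index,shard,prirep,state,unassigned.reason, pasted above.
cat_shards = """\
logstash-2022.02.21 0 p STARTED
logstash-2022.02.21 0 r STARTED
.monitoring-es-7-2022.02.22 0 p STARTED
.monitoring-es-7-2022.02.22 0 r STARTED
.kibana_2 0 p STARTED
.kibana_2 0 r STARTED
.kibana_task_manager_1 0 r UNASSIGNED REPLICA_ADDED
.kibana_task_manager_1 0 p UNASSIGNED CLUSTER_RECOVERED
logstash-2022.02.22 0 r STARTED
logstash-2022.02.22 0 p STARTED
.security-7 0 p UNASSIGNED CLUSTER_RECOVERED
.security-7 0 r UNASSIGNED REPLICA_ADDED
.monitoring-es-7-2022.02.21 0 p STARTED
.monitoring-es-7-2022.02.21 0 r STARTED
.tasks 0 r UNASSIGNED REPLICA_ADDED
.tasks 0 p UNASSIGNED CLUSTER_RECOVERED
.apm-agent-configuration 0 p UNASSIGNED CLUSTER_RECOVERED
.apm-agent-configuration 0 r UNASSIGNED REPLICA_ADDED
.kibana_1 0 r UNASSIGNED REPLICA_ADDED
.kibana_1 0 p UNASSIGNED CLUSTER_RECOVERED
"""

reasons = Counter()
for line in cat_shards.splitlines():
    # Columns: index, shard, prirep (p/r), state, unassigned.reason (if any)
    parts = line.split()
    if parts[3] == "UNASSIGNED":
        reasons[(parts[2], parts[4])] += 1

for (prirep, reason), n in sorted(reasons.items()):
    print(prirep, reason, n)
# → p CLUSTER_RECOVERED 5
# → r REPLICA_ADDED 5
```

So every unassigned primary is CLUSTER_RECOVERED and every unassigned replica is REPLICA_ADDED, which matches the 10 unassigned shards from the health output.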
Looking at the 'started' shards, they are all on the data nodes; I do not see any indices on the master nodes now. I do believe they were there before I made my changes, so I do not even know which master nodes these indices were related to.
A couple of other troubleshooting steps on these unassigned shards:
Here is part of the output of _cluster/reroute?retry_failed=true, which does not help either
(I did not paste the whole output here as it is too long):
{
"acknowledged": true,
"state": {
"cluster_uuid": "EQQ472KzSX6QcuQCs3jRuw",
"version": 30843835,
"state_uuid": "pKPDGAXlRhmIV-6RIUtYmg",
"master_node": "QfuHnzYfRS-e8hgkrLTQlQ",
"blocks": {},
"nodes": {
"6zS21m_HQFaBJolZQdEj7g": {
"name": "elastic-cluster-es-data-1",
"ephemeral_id": "ZoB5E487SYidstzd0J-fjA",
"transport_address": "10.233.93.153:9300",
"attributes": {
"xpack.installed": "true"
}
},
"QfuHnzYfRS-e8hgkrLTQlQ": {
"name": "elastic-cluster-es-master-0",
"ephemeral_id": "ipSIsbcFTpuPgTvJ9JMZbw",
"transport_address": "10.233.101.182:9300",
"attributes": {
"xpack.installed": "true"
}
},
................
}
}
},
"routing_table": {
"indices": {
".security-7": {
"shards": {
"0": [{
"state": "UNASSIGNED",
"primary": true,
"node": null,
"relocating_node": null,
"shard": 0,
"index": ".security-7",
"recovery_source": {
"type": "EXISTING_STORE",
"bootstrap_new_history_uuid": false
},
"unassigned_info": {
"reason": "CLUSTER_RECOVERED",
"at": "2022-02-21T18:15:40.290Z",
"delayed": false,
"allocation_status": "no_valid_shard_copy"
}
}, {
"state": "UNASSIGNED",
"primary": false,
"node": null,
"relocating_node": null,
"shard": 0,
"index": ".security-7",
"recovery_source": {
"type": "PEER"
},
"unassigned_info": {
"reason": "REPLICA_ADDED",
"at": "2022-02-21T18:17:23.298Z",
"delayed": false,
"allocation_status": "no_attempt"
}
.............
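Pulling the key fields out of the .security-7 entry above (a trimmed-down sketch in Python; the JSON is hand-copied from the reroute response, not the full document):

```python
import json

# The two .security-7 shard copies from the routing_table above, trimmed down.
security_shard_copies = json.loads("""
[
  {"state": "UNASSIGNED", "primary": true,
   "recovery_source": {"type": "EXISTING_STORE"},
   "unassigned_info": {"reason": "CLUSTER_RECOVERED",
                       "allocation_status": "no_valid_shard_copy"}},
  {"state": "UNASSIGNED", "primary": false,
   "recovery_source": {"type": "PEER"},
   "unassigned_info": {"reason": "REPLICA_ADDED",
                       "allocation_status": "no_attempt"}}
]
""")

for copy in security_shard_copies:
    kind = "primary" if copy["primary"] else "replica"
    info = copy["unassigned_info"]
    print(kind, info["reason"], info["allocation_status"])
# → primary CLUSTER_RECOVERED no_valid_shard_copy
# → replica REPLICA_ADDED no_attempt
```

If I read this right, no_valid_shard_copy on the primary means the master cannot find any node holding an on-disk copy of that shard, which would explain why retry_failed reroutes do not help; the replica shows no_attempt because it has nothing to recover from until a primary is assigned.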
When I query the shard stores:
_shard_stores?pretty
{
"error" : {
"root_cause" : [
{
"type" : "master_not_discovered_exception",
"reason" : null
}
],
"type" : "master_not_discovered_exception",
"reason" : null
},
"status" : 503
}
The result is the same for _cat/nodes?v.
Before executing the command below, I checked the master nodes and did not find any indices under /usr/share/elasticsearch/nodes/, though there were indices on the data nodes. Now, after executing the command, I do not find anything under the nodes folder at all. I am not sure if the reroute command cleaned everything out.
_cluster/reroute (per Fix common cluster issues | Elasticsearch Guide [master] | Elastic)
I do not know what kind of issue I have run into, but any help is greatly appreciated.