Master has not removed previously failed shard. resending shard failure

hhalei · July 15, 2022, 1:09pm

ENV

Elasticsearch version: 7.8.0

Description:

We run a three-node cluster. Not long ago, the file system on one node was damaged and the node was forced offline. The cluster then ran on two nodes for a period of time, and the cluster status stayed green throughout. Recently, however, we started getting the following error when writing data:

{"_index":"index_xxx","_type":"_doc","_id":"xxx","status":404,"error":{"type":"shard_not_found_exception","reason":"no such shard","index_uuid":"taBxiWo6RhWWaRG3Ainy4A","shard":"0","index":"index_xxx"}}
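To see which shards are affected, the per-item errors in a bulk response can be grouped by index and shard number. The following is a minimal Python sketch, assuming the writes go through the bulk API and the parsed response is available as a dict; the function name `shard_not_found_counts` is hypothetical and not part of any Elasticsearch client.

```python
from collections import Counter


def shard_not_found_counts(bulk_response):
    """Count shard_not_found_exception errors per (index, shard).

    `bulk_response` is the parsed JSON body of a bulk request, where each
    entry in "items" has a single action key (index, create, update, delete).
    """
    counts = Counter()
    for item in bulk_response.get("items", []):
        for result in item.values():
            error = result.get("error")
            if not isinstance(error, dict):
                continue  # successful item, or an error in an unexpected shape
            if error.get("type") == "shard_not_found_exception":
                # The shard number arrives as a string, e.g. "0".
                counts[(error.get("index"), int(error["shard"]))] += 1
    return counts
```

Calling it on a response that contains the item above would report one failure for shard 0 of `index_xxx`.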

After checking, we can see that the cluster status is still green, and there are no errors in the Elasticsearch server log.
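One way to make this mismatch visible is to compare the shards that writes report as missing with what the master's routing table shows through `GET _cat/shards`. The sketch below is only an illustration under assumptions: the base URL passed to `fetch_cat_shards` is whatever address the cluster answers on, and both function names are hypothetical helpers, not Elasticsearch APIs.

```python
import json
from urllib.request import urlopen


def fetch_cat_shards(base_url, index):
    """Fetch _cat/shards rows for one index as JSON (needs a reachable cluster)."""
    url = f"{base_url}/_cat/shards/{index}?format=json&h=index,shard,prirep,state,node"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def started_but_failing(cat_rows, failing_shards):
    """Return primary rows shown as STARTED whose (index, shard) fails on write.

    `failing_shards` is a set of (index_name, shard_number) pairs collected
    from write errors such as shard_not_found_exception.
    """
    return [
        row for row in cat_rows
        if row["prirep"] == "p"
        and row["state"] == "STARTED"
        and (row["index"], int(row["shard"])) in failing_shards
    ]
```

Any row returned here is a primary that the cluster state counts as healthy while writes to it fail, which would be consistent with a green status alongside the errors above.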

We then tried adding a new node to the cluster. Once the new node had joined successfully, the following messages appeared in the server log of the master node:

[2022-07-15T19:18:16,727][WARN ][o.e.c.r.a.AllocationService] [node_ip] failing shard [failed shard, shard [index_xxx][1], node[6U4Yf_FYRJqnw06IWG82DQ], [P], s[STARTED], a[id=ScXTxC5PSDOVxgi78PJQwg], message [master {master_node}{RizeqIPITtu_fmtHRo-xhQ}{pjyrxWldTdWh8SwhOLNQ4Q}{master_node}{master_node:tcp_port}{dilmrt}{ml.machine_memory=134608896000, ml.max_open_jobs=20, xpack.installed=true, transform.node=true} has not removed previously failed shard. resending shard failure], failure [Unknown], markAsStale [true]]
[2022-07-15T19:18:16,736][INFO ][o.e.c.r.a.AllocationService] [master_node] Cluster health status changed from [GREEN] to [YELLOW] (reason: [shards failed [[index_xxx][1], [index_xxx][0], [index_xxx][4], ... [12 items in total]]]).

After that, the cluster status changed from green to yellow, and some shards went into recovery.

Question

Was there already a problem with those shards before we added the new node? If so, why did the cluster status show green the whole time?

Please help, thanks!

warkolm · July 20, 2022, 10:54pm

Welcome to our community!

7.8 has been EOL for some time and is unsupported. Are you able to upgrade to a supported version?

