Two unassigned shards: failed to create engine with NoSuchFileException

Hi,

Two of my shards are unassigned, so the cluster status is "RED".

I ran the cluster allocation explain API and got the following:

failed shard on node [6PiZfydPRRi06v_8oISQaA]: failed recovery, failure RecoveryFailedException[[wazuh-archives-3.x-2019.03.21][0]: Recovery failed on {6PiZfyd}{6PiZfydPRRi06v_8oISQaA}{33L5TeBQRuqy7zQGXmLkNQ}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=8112173056, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: IndexShardRecoveryException[failed to recover from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: NoSuchFileException[/mnt/externaldrive/elasticsearch/nodes/0/indices/HMhWfsmASrevHvF6c8zE9g/0/index/_o76.si];

"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes that hold an in-sync shard copy"

shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2019-03-22T08:41:01.378Z], failed_attempts[5], delayed=false, details[failed shard on node [6PiZfydPRRi06v_8oISQaA]: failed recovery, failure RecoveryFailedException[[wazuh-archives-3.x-2019.03.21][0]: Recovery failed on {6PiZfyd}{6PiZfydPRRi06v_8oISQaA}{33L5TeBQRuqy7zQGXmLkNQ}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=8112173056, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: IndexShardRecoveryException[failed to recover from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: NoSuchFileException[/mnt/externaldrive/elasticsearch/nodes/0/indices/HMhWfsmASrevHvF6c8zE9g/0/index/_o76.si]; ], allocation_status[deciders_no]]]

I have manually called /_cluster/reroute?retry_failed=true via curl several times, but with no luck. I am stuck at this point, so please help me with this.
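
For reference, the two calls mentioned above (the allocation explain request and the retry) look roughly like this in Python. This is only a minimal sketch: it assumes a node reachable at http://localhost:9200 without authentication, and the helper names are illustrative, not part of any Elasticsearch client.

```python
import json
import urllib.request

# Assumption: a node is reachable here without authentication; adjust as needed.
ES_URL = "http://localhost:9200"


def explain_request(index, shard, primary=True, base_url=ES_URL):
    """Build a request asking why one specific shard copy is unassigned."""
    body = json.dumps({"index": index, "shard": shard, "primary": primary})
    return urllib.request.Request(
        f"{base_url}/_cluster/allocation/explain",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def retry_failed_request(base_url=ES_URL):
    """Build the request that retries shards which hit the allocation retry limit."""
    return urllib.request.Request(
        f"{base_url}/_cluster/reroute?retry_failed=true",
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Index name and shard number are taken from the error output above.
    print(send(explain_request("wazuh-archives-3.x-2019.03.21", 0)))
    print(send(retry_failed_request()))
```

The retry only repeats the same recovery attempt, so it is expected to keep failing for as long as the underlying cause is still present.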

Thanks

If this file is really not there, I don't think there's a way to recover this shard. The best way forward is to restore it from a recent snapshot.
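
To confirm whether the file is really gone, something like the sketch below could be run on the node that holds the shard. It pulls the paths out of the NoSuchFileException entries in the explain output and checks them on disk. It also builds a snapshot restore request. The repository and snapshot names are hypothetical placeholders, the node address is an assumption, and the helper names are made up for illustration.

```python
import json
import os
import re
import urllib.request

# Matches entries such as: NoSuchFileException[/path/to/file]
MISSING_FILE = re.compile(r"NoSuchFileException\[([^\]]+)\]")


def missing_files(explanation):
    """Return the distinct paths named in NoSuchFileException entries, in order."""
    seen = []
    for path in MISSING_FILE.findall(explanation):
        if path not in seen:
            seen.append(path)
    return seen


def still_missing(paths):
    """Return the paths that do not exist on this machine's filesystem."""
    return [p for p in paths if not os.path.exists(p)]


def restore_request(repository, snapshot, index,
                    base_url="http://localhost:9200"):
    """Build a request restoring one index from a snapshot.

    repository and snapshot are placeholders for names in your own cluster.
    """
    body = json.dumps({"indices": index})
    return urllib.request.Request(
        f"{base_url}/_snapshot/{repository}/{snapshot}/_restore",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumption: the explain output was saved to this (hypothetical) file.
    with open("explain_output.txt") as fh:
        print(still_missing(missing_files(fh.read())))
```

Note that an index that already exists in the cluster generally has to be closed or deleted before a restore of the same name will be accepted.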

May I know the reason for this? How did it happen, and what caused it?

This happened on my testing server, where I don't have a snapshot. If it happened in production it would be a big problem, so I want to know what causes it so that I can take preventive measures.

It's hard to say. Something deleted this file, but it wasn't Elasticsearch. You should prevent things from deleting files managed by Elasticsearch.
