Hi All!
Wondering if anyone has suggestions. We've got a cluster that currently keeps running into problems with shards being unassigned. Manually allocating the shard fails with the error "Structure needs cleaning" (see the error message below). Does anyone know what causes this and what the best way to recover is?
I saw this prior post, unassigned-shards-in-10-node-cluster, which seems to lead to a dead end. I don't know if the OP, @ivten, has any updates from that case.
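In case it helps anyone reproduce this, here is a rough sketch of how output like the "Error message" section below can be pulled and picked apart. The output looks like what the cluster allocation explain API returns. The sketch assumes an unsecured node at http://localhost:9200 (adjust for your setup), and the function names are just ones I made up for illustration. `root_cause` only digs the innermost "nested:" entry out of the "details" string, and `retry_failed` asks the cluster to retry shards that have hit the allocation retry limit.

```python
import json
import re
import urllib.request

# Assumption: an unsecured node reachable on localhost; change for your cluster.
ES_URL = "http://localhost:9200"


def _post(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def explain(index, shard, primary, base_url=ES_URL):
    """Ask the cluster why a specific shard copy is unassigned."""
    body = {"index": index, "shard": shard, "primary": primary}
    return _post(f"{base_url}/_cluster/allocation/explain", body)


def retry_failed(base_url=ES_URL):
    """Retry allocation of shards that exhausted their allocation attempts."""
    return _post(f"{base_url}/_cluster/reroute?retry_failed=true", {})


def root_cause(details):
    """Return (exception, message) for the innermost 'nested:' entry, or None."""
    matches = re.findall(r"nested: (\w+)\[(.*?)\];", details)
    return matches[-1] if matches else None
```

For the shard in this post, the call would be `explain("logstash-2018.02.05", 6, False)`, and passing `result["unassigned_info"]["details"]` to `root_cause` should surface the final IOException rather than the whole chain.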
Error message:
{
  "index": "logstash-2018.02.05",
  "shard": 6,
  "primary": false,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "ALLOCATION_FAILED",
    "at": "2018-02-05T01:44:10.312Z",
    "failed_allocation_attempts": 5,
    "details": "failed recovery, failure RecoveryFailedException[[logstash-2018.02.05][6]: Recovery failed from {elasticsearch_data_54}{cB904jlPS3WltGsM885a0g}{pxFRvAldTKGpohmz3LzCuQ}{redacted_ip}{redacted_ip:9300} into {elasticsearch_data_45}{GVKKtQsiQ6a2MCqUt7HFXw}{6HztnYV3QV2AC6q-cLCR8A}{redacted_ip}{redacted_ip:9300}]; nested: RemoteTransportException[[elasticsearch_data_54][redacted_ip:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [131] files with total size of [434.7mb]]; nested: RemoteTransportException[[elasticsearch_data_45][redacted_ip:9300][internal:index/shard/recovery/file_chunk]]; nested: IOException[Structure needs cleaning]; ",
    "last_allocation_status": "no_attempt"
  }