Hi David
It has quieted down. The cluster health is yellow, and it has stopped syncing at 99.98494089300505%.
It looks like four indices have problems.
All the green ones show:
{"status":"green","number_of_shards":5,"number_of_replicas":1,"active_primary_shards":5,"active_shards":10,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0},"filebeat-6.5.1-2016.05.14":
While the four yellow ones show:
{"status":"yellow","number_of_shards":5,"number_of_replicas":1,"active_primary_shards":5,"active_shards":9,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":1},"filebeat-6.5.1-2016.05.13":
So each of these four indices is missing one active shard, i.e. one unassigned replica.
The same thing shows up at the cluster level: _cluster/health reports 4 unassigned shards.
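For reference, the per-index numbers above come from the cluster health API with index-level detail; something along these lines should reproduce them (assuming the cluster's HTTP interface is reachable on localhost:9200, substitute your own host and port):

curl -s 'http://localhost:9200/_cluster/health?pretty'
curl -s 'http://localhost:9200/_cluster/health?level=indices&pretty'

The first call gives the cluster-wide status and the unassigned_shards count, the second breaks the same figures down per index.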
Below is what the allocation explain API (_cluster/allocation/explain) reports for one of the four unassigned shards.
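For reference, a request along these lines produces this output (again assuming localhost:9200; the index, shard and primary values identify the unassigned replica described below):

curl -s -H 'Content-Type: application/json' \
  'http://localhost:9200/_cluster/allocation/explain?pretty' \
  -d '{"index": "filebeat-6.5.1-2017.09.20", "shard": 3, "primary": false}'

The response: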
{
"index": "filebeat-6.5.1-2017.09.20",
"shard": 3,
"primary": false,
"current_state": "unassigned",
"unassigned_info": {
"reason": "ALLOCATION_FAILED",
"at": "2018-12-13T02:46:15.515Z",
"failed_allocation_attempts": 5,
"details": "failed shard on node [uEcPa_WSTF2hpMgqXIC1Ww]: failed recovery, failure RecoveryFailedException[[filebeat-6.5.1-2017.09.20][3]: Recovery failed from {es03}{utNZZop8SRuCuX_KZffjuw}{lDLyWs5ZR1mJHQ1pgIOQVw}{172.20.0.5}{172.20.0.5:9300}{ml.machine_memory=37779542016, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {es02}{uEcPa_WSTF2hpMgqXIC1Ww}{BF-RbwnKRzecVVNE_s8cvw}{172.20.0.3}{172.20.0.3:9300}{ml.machine_memory=37779542016, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[es03][172.20.0.5:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[es02][172.20.0.3:9300][internal:index/shard/recovery/prepare_translog]]; nested: TranslogCorruptedException[translog from source [/usr/share/elasticsearch/data/nodes/0/indices/9AHmqT9SRECymSKzcGZH1w/3/translog/translog-21.tlog] is corrupted, expected shard UUID [35 53 30 78 55 35 35 70 51 69 71 66 47 6a 67 41 72 4a 36 59 5a 41] but got: [71 42 70 57 69 52 7a 6e 52 4a 75 69 49 70 75 67 4b 34 43 51 36 67] this translog file belongs to a different translog]; ",
"last_allocation_status": "no_attempt"
},
"can_allocate": "no",
"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions": [
{
"node_id": "RuY8hzATQLimTu4IQ0c--Q",
"node_name": "es01",
"transport_address": "172.20.0.4:9300",
"node_attributes": {
"ml.machine_memory": "37779542016",
"xpack.installed": "true",
"ml.max_open_jobs": "20",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "max_retry",
"decision": "NO",
"explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2018-12-13T02:46:15.515Z], failed_attempts[5], delayed=false, details[failed shard on node [uEcPa_WSTF2hpMgqXIC1Ww]: failed recovery, failure RecoveryFailedException[[filebeat-6.5.1-2017.09.20][3]: Recovery failed from {es03}{utNZZop8SRuCuX_KZffjuw}{lDLyWs5ZR1mJHQ1pgIOQVw}{172.20.0.5}{172.20.0.5:9300}{ml.machine_memory=37779542016, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {es02}{uEcPa_WSTF2hpMgqXIC1Ww}{BF-RbwnKRzecVVNE_s8cvw}{172.20.0.3}{172.20.0.3:9300}{ml.machine_memory=37779542016, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[es03][172.20.0.5:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[es02][172.20.0.3:9300][internal:index/shard/recovery/prepare_translog]]; nested: TranslogCorruptedException[translog from source [/usr/share/elasticsearch/data/nodes/0/indices/9AHmqT9SRECymSKzcGZH1w/3/translog/translog-21.tlog] is corrupted, expected shard UUID [35 53 30 78 55 35 35 70 51 69 71 66 47 6a 67 41 72 4a 36 59 5a 41] but got: [71 42 70 57 69 52 7a 6e 52 4a 75 69 49 70 75 67 4b 34 43 51 36 67] this translog file belongs to a different translog]; ], allocation_status[no_attempt]]]"
}
]
},
{
"node_id": "uEcPa_WSTF2hpMgqXIC1Ww",
"node_name": "es02",
"transport_address": "172.20.0.3:9300",
"node_attributes": {
"ml.machine_memory": "37779542016",
"ml.max_open_jobs": "20",
"xpack.installed": "true",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "max_retry",
"decision": "NO",
"explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2018-12-13T02:46:15.515Z], failed_attempts[5], delayed=false, details[failed shard on node [uEcPa_WSTF2hpMgqXIC1Ww]: failed recovery, failure RecoveryFailedException[[filebeat-6.5.1-2017.09.20][3]: Recovery failed from {es03}{utNZZop8SRuCuX_KZffjuw}{lDLyWs5ZR1mJHQ1pgIOQVw}{172.20.0.5}{172.20.0.5:9300}{ml.machine_memory=37779542016, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {es02}{uEcPa_WSTF2hpMgqXIC1Ww}{BF-RbwnKRzecVVNE_s8cvw}{172.20.0.3}{172.20.0.3:9300}{ml.machine_memory=37779542016, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[es03][172.20.0.5:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[es02][172.20.0.3:9300][internal:index/shard/recovery/prepare_translog]]; nested: TranslogCorruptedException[translog from source [/usr/share/elasticsearch/data/nodes/0/indices/9AHmqT9SRECymSKzcGZH1w/3/translog/translog-21.tlog] is corrupted, expected shard UUID [35 53 30 78 55 35 35 70 51 69 71 66 47 6a 67 41 72 4a 36 59 5a 41] but got: [71 42 70 57 69 52 7a 6e 52 4a 75 69 49 70 75 67 4b 34 43 51 36 67] this translog file belongs to a different translog]; ], allocation_status[no_attempt]]]"
}
]
},
{
"node_id": "utNZZop8SRuCuX_KZffjuw",
"node_name": "es03",
"transport_address": "172.20.0.5:9300",
"node_attributes": {
"ml.machine_memory": "37779542016",
"ml.max_open_jobs": "20",
"xpack.installed": "true",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [