I have an 8-node cluster: 3 master nodes, 3 data nodes, and 2 coordinating nodes. Every day I see missing replica shards. As a workaround, I manually close and reopen the affected indices and then refresh them in Kibana, which solves the problem. No data node has left the cluster, so why does this happen?
GET /_cluster/allocation/explain
{
"index" : "log-wlb-sysmon-2020.12.29",
"shard" : 1,
"primary" : false,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "ALLOCATION_FAILED",
"at" : "2020-12-29T01:39:59.630Z",
"failed_allocation_attempts" : 5,
"details" : "failed shard on node [voj77bzkQe-Dgzz9qiVudA]: failed recovery, failure RecoveryFailedException[[log-wlb-sysmon-2020.12.29][1]: Recovery failed from {ed3}{2BRhL-iTSeWCIx2fRH1jlA}{o7arVIoJSH-QEW2PbLOTmQ}{ed3}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false} into {ed2}{voj77bzkQe-Dgzz9qiVudA}{nHyE4sVaQBeF1hgs6QD0Xw}{ed2}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false}]; nested: RemoteTransportException[[ed3][XX.XX.XX.XX:9300][internal:index/shard/recovery/start_recovery]]; nested: CircuitBreakingException[[parent] Data too large, data for [internal:index/shard/recovery/start_recovery] would be [7357090166/6.8gb], which is larger than the limit of [7140383129/6.6gb], real usage: [7357087176/6.8gb], new bytes reserved: [2990/2.9kb], usages [request=0/0b, fielddata=2984808609/2.7gb, in_flight_requests=2990/2.9kb, model_inference=0/0b, accounting=240827968/229.6mb]]; ",
"last_allocation_status" : "no_attempt"
},
"can_allocate" : "no",
"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions" : [
{
"node_id" : "2BRhL-iTSeWCIx2fRH1jlA",
"node_name" : "ed3",
"transport_address" : "XX.XX.XX.XX:9300",
"node_attributes" : {
"xpack.installed" : "true",
"transform.node" : "false"
},
"node_decision" : "no",
"deciders" : [
{
"decider" : "max_retry",
"decision" : "NO",
"explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-12-29T01:39:59.630Z], failed_attempts[5], failed_nodes[[voj77bzkQe-Dgzz9qiVudA, pytohdtxQ-ywNaRIFnrLaw]], delayed=false, details[failed shard on node [voj77bzkQe-Dgzz9qiVudA]: failed recovery, failure RecoveryFailedException[[log-wlb-sysmon-2020.12.29][1]: Recovery failed from {ed3}{2BRhL-iTSeWCIx2fRH1jlA}{o7arVIoJSH-QEW2PbLOTmQ}{ed3}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false} into {ed2}{voj77bzkQe-Dgzz9qiVudA}{nHyE4sVaQBeF1hgs6QD0Xw}{ed2}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false}]; nested: RemoteTransportException[[ed3][XX.XX.XX.XX:9300][internal:index/shard/recovery/start_recovery]]; nested: CircuitBreakingException[[parent] Data too large, data for [internal:index/shard/recovery/start_recovery] would be [7357090166/6.8gb], which is larger than the limit of [7140383129/6.6gb], real usage: [7357087176/6.8gb], new bytes reserved: [2990/2.9kb], usages [request=0/0b, fielddata=2984808609/2.7gb, in_flight_requests=2990/2.9kb, model_inference=0/0b, accounting=240827968/229.6mb]]; ], allocation_status[no_attempt]]]"
},
{
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "a copy of this shard is already allocated to this node [[log-wlb-sysmon-2020.12.29][1], node[2BRhL-iTSeWCIx2fRH1jlA], [P], s[STARTED], a[id=YuD_poc8TZCq5nWjVoDZrw]]"
}
]
},
{
"node_id" : "pytohdtxQ-ywNaRIFnrLaw",
"node_name" : "ed1",
"transport_address" : "XX.XX.XX.XX:9300",
"node_attributes" : {
"xpack.installed" : "true",
"transform.node" : "false"
},
"node_decision" : "no",
"deciders" : [
{
"decider" : "max_retry",
"decision" : "NO",
"explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-12-29T01:39:59.630Z], failed_attempts[5], failed_nodes[[voj77bzkQe-Dgzz9qiVudA, pytohdtxQ-ywNaRIFnrLaw]], delayed=false, details[failed shard on node [voj77bzkQe-Dgzz9qiVudA]: failed recovery, failure RecoveryFailedException[[log-wlb-sysmon-2020.12.29][1]: Recovery failed from {ed3}{2BRhL-iTSeWCIx2fRH1jlA}{o7arVIoJSH-QEW2PbLOTmQ}{ed3}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false} into {ed2}{voj77bzkQe-Dgzz9qiVudA}{nHyE4sVaQBeF1hgs6QD0Xw}{ed2}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false}]; nested: RemoteTransportException[[ed3][XX.XX.XX.XX:9300][internal:index/shard/recovery/start_recovery]]; nested: CircuitBreakingException[[parent] Data too large, data for [internal:index/shard/recovery/start_recovery] would be [7357090166/6.8gb], which is larger than the limit of [7140383129/6.6gb], real usage: [7357087176/6.8gb], new bytes reserved: [2990/2.9kb], usages [request=0/0b, fielddata=2984808609/2.7gb, in_flight_requests=2990/2.9kb, model_inference=0/0b, accounting=240827968/229.6mb]]; ], allocation_status[no_attempt]]]"
}
]
},
{
"node_id" : "voj77bzkQe-Dgzz9qiVudA",
"node_name" : "ed2",
"transport_address" : "XX.XX.XX.XX:9300",
"node_attributes" : {
"xpack.installed" : "true",
"transform.node" : "false"
},
"node_decision" : "no",
"deciders" : [
{
"decider" : "max_retry",
"decision" : "NO",
"explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-12-29T01:39:59.630Z], failed_attempts[5], failed_nodes[[voj77bzkQe-Dgzz9qiVudA, pytohdtxQ-ywNaRIFnrLaw]], delayed=false, details[failed shard on node [voj77bzkQe-Dgzz9qiVudA]: failed recovery, failure RecoveryFailedException[[log-wlb-sysmon-2020.12.29][1]: Recovery failed from {ed3}{2BRhL-iTSeWCIx2fRH1jlA}{o7arVIoJSH-QEW2PbLOTmQ}{ed3}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false} into {ed2}{voj77bzkQe-Dgzz9qiVudA}{nHyE4sVaQBeF1hgs6QD0Xw}{ed2}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false}]; nested: RemoteTransportException[[ed3][XX.XX.XX.XX:9300][internal:index/shard/recovery/start_recovery]]; nested: CircuitBreakingException[[parent] Data too large, data for [internal:index/shard/recovery/start_recovery] would be [7357090166/6.8gb], which is larger than the limit of [7140383129/6.6gb], real usage: [7357087176/6.8gb], new bytes reserved: [2990/2.9kb], usages [request=0/0b, fielddata=2984808609/2.7gb, in_flight_requests=2990/2.9kb, model_inference=0/0b, accounting=240827968/229.6mb]]; ], allocation_status[no_attempt]]]"
}
]
}
]
}
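To make the long `details` string above easier to read, here is a minimal Python sketch that pulls the byte counts out of the CircuitBreakingException message. The function name `parse_breaker_details` and the `SAMPLE` constant are my own (hypothetical, not part of Elasticsearch), and the regular expressions only assume the message format shown in this output.

```python
import re

# Shortened copy of the "details" text from the output above.
SAMPLE = (
    "CircuitBreakingException[[parent] Data too large, data for "
    "[internal:index/shard/recovery/start_recovery] would be [7357090166/6.8gb], "
    "which is larger than the limit of [7140383129/6.6gb], "
    "real usage: [7357087176/6.8gb], new bytes reserved: [2990/2.9kb], "
    "usages [request=0/0b, fielddata=2984808609/2.7gb, "
    "in_flight_requests=2990/2.9kb, model_inference=0/0b, "
    "accounting=240827968/229.6mb]]"
)


def parse_breaker_details(details):
    """Extract byte counts from a circuit breaker message (hypothetical helper)."""
    would_be = re.search(r"would be \[(\d+)/", details)
    limit = re.search(r"limit of \[(\d+)/", details)
    usages = re.search(r"usages \[(.*?)\]", details)
    if not (would_be and limit and usages):
        raise ValueError("message does not look like a circuit breaker error")
    # Each entry looks like "fielddata=2984808609/2.7gb"; keep the raw byte count.
    per_breaker = {
        name: int(num)
        for name, num in re.findall(r"(\w+)=(\d+)/", usages.group(1))
    }
    return {
        "would_be": int(would_be.group(1)),
        "limit": int(limit.group(1)),
        "usages": per_breaker,
    }


if __name__ == "__main__":
    info = parse_breaker_details(SAMPLE)
    print("bytes over the parent limit:", info["would_be"] - info["limit"])
    # List the tracked breakers from largest to smallest.
    for name, num in sorted(info["usages"].items(), key=lambda kv: -kv[1]):
        print(name, num)
```

In this output, fielddata (2.7gb) is the largest tracked entry, but the tracked entries add up to well under the 6.8gb real usage against the 6.6gb parent limit. The retry call named in the `max_retry` explanation is `POST /_cluster/reroute?retry_failed=true`.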