Why is my replica shard not allocated?

I have an 8-node cluster: 3 master nodes, 3 data nodes, and 2 coordinating nodes. Every day I see missing replica shards, and I manually close, reopen, and then refresh those indices in Kibana, which solves the problem. No data node has left the cluster, so why does this happen?


GET /_cluster/allocation/explain

{
  "index" : "log-wlb-sysmon-2020.12.29",
  "shard" : 1,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "ALLOCATION_FAILED",
    "at" : "2020-12-29T01:39:59.630Z",
    "failed_allocation_attempts" : 5,
    "details" : "failed shard on node [voj77bzkQe-Dgzz9qiVudA]: failed recovery, failure RecoveryFailedException[[log-wlb-sysmon-2020.12.29][1]: Recovery failed from {ed3}{2BRhL-iTSeWCIx2fRH1jlA}{o7arVIoJSH-QEW2PbLOTmQ}{ed3}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false} into {ed2}{voj77bzkQe-Dgzz9qiVudA}{nHyE4sVaQBeF1hgs6QD0Xw}{ed2}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false}]; nested: RemoteTransportException[[ed3][XX.XX.XX.XX:9300][internal:index/shard/recovery/start_recovery]]; nested: CircuitBreakingException[[parent] Data too large, data for [internal:index/shard/recovery/start_recovery] would be [7357090166/6.8gb], which is larger than the limit of [7140383129/6.6gb], real usage: [7357087176/6.8gb], new bytes reserved: [2990/2.9kb], usages [request=0/0b, fielddata=2984808609/2.7gb, in_flight_requests=2990/2.9kb, model_inference=0/0b, accounting=240827968/229.6mb]]; ",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "2BRhL-iTSeWCIx2fRH1jlA",
      "node_name" : "ed3",
      "transport_address" : "XX.XX.XX.XX:9300",
      "node_attributes" : {
        "xpack.installed" : "true",
        "transform.node" : "false"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "NO",
          "explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-12-29T01:39:59.630Z], failed_attempts[5], failed_nodes[[voj77bzkQe-Dgzz9qiVudA, pytohdtxQ-ywNaRIFnrLaw]], delayed=false, details[failed shard on node [voj77bzkQe-Dgzz9qiVudA]: failed recovery, failure RecoveryFailedException[[log-wlb-sysmon-2020.12.29][1]: Recovery failed from {ed3}{2BRhL-iTSeWCIx2fRH1jlA}{o7arVIoJSH-QEW2PbLOTmQ}{ed3}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false} into {ed2}{voj77bzkQe-Dgzz9qiVudA}{nHyE4sVaQBeF1hgs6QD0Xw}{ed2}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false}]; nested: RemoteTransportException[[ed3][XX.XX.XX.XX:9300][internal:index/shard/recovery/start_recovery]]; nested: CircuitBreakingException[[parent] Data too large, data for [internal:index/shard/recovery/start_recovery] would be [7357090166/6.8gb], which is larger than the limit of [7140383129/6.6gb], real usage: [7357087176/6.8gb], new bytes reserved: [2990/2.9kb], usages [request=0/0b, fielddata=2984808609/2.7gb, in_flight_requests=2990/2.9kb, model_inference=0/0b, accounting=240827968/229.6mb]]; ], allocation_status[no_attempt]]]"
        },
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "a copy of this shard is already allocated to this node [[log-wlb-sysmon-2020.12.29][1], node[2BRhL-iTSeWCIx2fRH1jlA], [P], s[STARTED], a[id=YuD_poc8TZCq5nWjVoDZrw]]"
        }
      ]
    },
    {
      "node_id" : "pytohdtxQ-ywNaRIFnrLaw",
      "node_name" : "ed1",
      "transport_address" : "XX.XX.XX.XX:9300",
      "node_attributes" : {
        "xpack.installed" : "true",
        "transform.node" : "false"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "NO",
          "explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-12-29T01:39:59.630Z], failed_attempts[5], failed_nodes[[voj77bzkQe-Dgzz9qiVudA, pytohdtxQ-ywNaRIFnrLaw]], delayed=false, details[failed shard on node [voj77bzkQe-Dgzz9qiVudA]: failed recovery, failure RecoveryFailedException[[log-wlb-sysmon-2020.12.29][1]: Recovery failed from {ed3}{2BRhL-iTSeWCIx2fRH1jlA}{o7arVIoJSH-QEW2PbLOTmQ}{ed3}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false} into {ed2}{voj77bzkQe-Dgzz9qiVudA}{nHyE4sVaQBeF1hgs6QD0Xw}{ed2}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false}]; nested: RemoteTransportException[[ed3][XX.XX.XX.XX:9300][internal:index/shard/recovery/start_recovery]]; nested: CircuitBreakingException[[parent] Data too large, data for [internal:index/shard/recovery/start_recovery] would be [7357090166/6.8gb], which is larger than the limit of [7140383129/6.6gb], real usage: [7357087176/6.8gb], new bytes reserved: [2990/2.9kb], usages [request=0/0b, fielddata=2984808609/2.7gb, in_flight_requests=2990/2.9kb, model_inference=0/0b, accounting=240827968/229.6mb]]; ], allocation_status[no_attempt]]]"
        }
      ]
    },
    {
      "node_id" : "voj77bzkQe-Dgzz9qiVudA",
      "node_name" : "ed2",
      "transport_address" : "XX.XX.XX.XX:9300",
      "node_attributes" : {
        "xpack.installed" : "true",
        "transform.node" : "false"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "NO",
          "explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-12-29T01:39:59.630Z], failed_attempts[5], failed_nodes[[voj77bzkQe-Dgzz9qiVudA, pytohdtxQ-ywNaRIFnrLaw]], delayed=false, details[failed shard on node [voj77bzkQe-Dgzz9qiVudA]: failed recovery, failure RecoveryFailedException[[log-wlb-sysmon-2020.12.29][1]: Recovery failed from {ed3}{2BRhL-iTSeWCIx2fRH1jlA}{o7arVIoJSH-QEW2PbLOTmQ}{ed3}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false} into {ed2}{voj77bzkQe-Dgzz9qiVudA}{nHyE4sVaQBeF1hgs6QD0Xw}{ed2}{XX.XX.XX.XX:9300}{d}{xpack.installed=true, transform.node=false}]; nested: RemoteTransportException[[ed3][XX.XX.XX.XX:9300][internal:index/shard/recovery/start_recovery]]; nested: CircuitBreakingException[[parent] Data too large, data for [internal:index/shard/recovery/start_recovery] would be [7357090166/6.8gb], which is larger than the limit of [7140383129/6.6gb], real usage: [7357087176/6.8gb], new bytes reserved: [2990/2.9kb], usages [request=0/0b, fielddata=2984808609/2.7gb, in_flight_requests=2990/2.9kb, model_inference=0/0b, accounting=240827968/229.6mb]]; ], allocation_status[no_attempt]]]"
        }
      ]
    }
  ]
}

I've never seen a circuit breaker for a recovery action!

Part of the issue seems to be that you have ~2400 shards across 3 nodes, which is pretty excessive. You should look to reduce that.

I have 11 indices; each index has 3 shards and one replica. I set this up from the beginning, but after one month I started facing this problem. We have space to allocate, so why is it not allocating?

Given the data volume in your cluster that is excessive. It is generally recommended to aim for a shard size measured in tens of GB and your average shard size is under 300MB, which is very, very small.

How long are you looking to retain data in your cluster? Is the shard count expected to grow?

It's not all that unusual, but recovery isn't a big memory consumer, so we normally only see these if, as here, the cluster is already right on the edge for unrelated reasons (e.g. too many shards). Since 7.8.0 it isn't immediately fatal to recoveries any more; retries were added for this case.

Since this cluster is running 7.10.1, that means we already did a bunch of retries and gave up because they all failed for similar reasons. The fix is, as mentioned above, to substantially reduce the shard count.
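Once heap pressure is reduced (for example by lowering the shard count and fielddata usage), the unassigned replicas can be retried with the call the max_retry decider output above points to:

POST /_cluster/reroute?retry_failed=true

This resets the failed-allocation counter and attempts allocation again; if the underlying cause is still present, it will simply fail another five times.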


How did you determine that the shard size is 300MB? As I said earlier, I have 11 indices and each index has a different storage size; for example, log-wlb-sysmon gets 10GB of logs per day while other indices get 1 to 10MB.

The storage does not get full; if it reaches 80% of total storage we trigger Curator.

Your stats above indicated 615GB data across 2358 shards, which is an average of 261MB.
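You can verify those numbers yourself: _cat/allocation shows disk usage and shard count per data node, and _cat/shards lists the size of each individual shard.

GET _cat/allocation?v
GET _cat/shards?v&bytes=mb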

If you have one index that is much larger than the others, do not use the same settings across the board. For the smaller indices it would make sense to have just a single primary shard and to move away from daily indices.

The best way to do this is generally to use rollover with ILM. That way you can specify max size and age of indices and make sure you get larger shards as each index may cover a longer time period.
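As a sketch (the policy name and thresholds here are only illustrative, not a recommendation for this cluster), a rollover-based ILM policy could look like this:

PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "30gb",
            "max_age": "30d"
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

An index then rolls over to a new backing index once it reaches 30GB or 30 days, whichever comes first, so low-volume data accumulates into fewer, larger shards instead of one tiny index per day.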

Is it shard size = storage consumed / total shards (615GB / 2358 ≈ 261MB)?
Sorry @Christian_Dahlqvist, I am just new and don't know how to calculate things like shard size.

I use time-based indices; a new index is created each day. Before using an ILM policy in my production cluster, I need to ask some questions:

  1. If one of my indices is 122GB and has 3 shards with one replica, is this good?
  2. Other indices are only a few MB in size, like 600MB or 7MB; for these, should I create one primary shard?

Sounds reasonable as it is 122GB across 6 shards.

If you intend to keep your data in the cluster for an extended period of time I would recommend going down to a single primary shard but also switch to weekly or monthly indices (depends on your retention period).

Why did it happen? I mean, why is the replica shard missing or not allocated?

I am planning to reduce the number of shards, but I don't know what procedure to follow for 2358 shards.

Do the following:

  • Look at each index pattern and change index templates to have the appropriate number of primary shards (1 for small indices).
  • Change your indexing so you switch to weekly or even monthly indices where appropriate.

This will give you a much better sharding setup going forward and will stop so many new shards from being generated.
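For example (the template name is illustrative), a composable index template that gives every new small index a single primary shard could look like:

PUT _index_template/small-logs
{
  "index_patterns": ["log-pb-flow-*"],
  "template": {
    "settings": {
      "index.number_of_shards": 1,
      "index.number_of_replicas": 1
    }
  }
}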

If you have a fixed retention period and this is relatively short you can choose to do nothing more and just wait for indices with small shards to be deleted as they age. This will minimize the amount of work required and will reduce the shard count over time.

If, however, you plan to keep data for a long time and the indices with lots of very small shards are not likely to be deleted anytime soon, you may need to:

  • Use the shrink index API to reduce the number of primary shards to 1 for small indices.
  • If you need to reduce shard count more you may need to use the reindex API to reindex small daily indices into larger monthly indices (with 1 primary shard) and then delete the old indices.
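As an illustration of the second point (the index names here are only an example), _reindex accepts a wildcard source, so a month of daily indices can be merged into one monthly index:

POST _reindex
{
  "source": { "index": "log-pb-flow-2021.01.*" },
  "dest": { "index": "log-pb-flow-2021.01" }
}

The destination index should be created first from a template with a single primary shard, and the old daily indices deleted only after verifying that the document counts match.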

We are planning to use the shrink API to reduce the shard count. We want the changes in our old indices because we have some dashboards that use the old indices' data, and if we use new indices we would have to change our dashboards as well.

You should not need to make any changes in dashboards as long as the names of the new indices match the same index patterns.

Hey @Christian_Dahlqvist, is it possible to shrink all indices at once, like:

POST log-pb-flow-*/_shrink/log-pb-flow_1
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.number_of_shards": 1, 
    "index.codec": "best_compression" 
  }
} 

because I have 10 indices matching this pattern:

health status index                  uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   log-pb-flow-2021.01.09 4wRBPWSbQ9m3KqRJ9D_c4A   6   0      58303            0     24.9mb         24.9mb
green  open   log-pb-flow-2021.01.08 syGfM1DHT86msmEPpfzK3g   6   0     131352            0     55.4mb         55.4mb
green  open   log-pb-flow-2021.01.07 aHMRR_qORgeKe4bptznBuw   6   0      19285            0      9.2mb          9.2mb
green  open   log-pb-flow-2021.01.13 bdsxYn4SSra89PQt9oIAtg   6   0       6497            0      4.5mb          4.5mb
green  open   log-pb-flow-2021.01.12 aL9zPpjCRQ6vnt3Ae_3ybg   6   0     134114            0     54.8mb         54.8mb
green  open   log-pb-flow-2021.01.11 8p1ownv-Su-bkKVtSp2syA   6   0      23369            0      9.2mb          9.2mb
green  open   log-pb-flow-2021.01.10 6aZYY-DfTOa445m8bnpMuw   6   0      20699            0      9.2mb          9.2mb
green  open   log-pb-flow-2021.01.16 GtdWFzrVQOa-5gYXqbwf-w   6   0       5333            0      3.3mb          3.3mb
green  open   log-pb-flow-2021.01.15 _kOms79nQ56Q5cuCymTDiw   6   0      74968            0     31.6mb         31.6mb
green  open   log-pb-flow-2021.01.14 47MggH4bRhqH1civvkZYbQ   6   0     146539            0     48.8mb         48.8mb
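For what it's worth, the _shrink API targets a single concrete index, so a wildcard source as in the request above will not work; each daily index needs its own request. The source index must also first be made read-only, with a copy of every shard on one node. A sketch for the first index in the list (using node ed1 from the output earlier; pick whichever data node suits):

PUT /log-pb-flow-2021.01.09/_settings
{
  "index.routing.allocation.require._name": "ed1",
  "index.blocks.write": true
}

POST /log-pb-flow-2021.01.09/_shrink/log-pb-flow-2021.01.09-shrunk
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.number_of_shards": 1,
    "index.codec": "best_compression"
  }
}

The target shard count must be a factor of the source's (6 → 1 is fine), and the write block can be removed from the target afterwards if needed.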