Enabling shard allocation does not enable it for follower indices

Hi there!

I've run into some odd behavior and I'm not sure whether it's my fault or not.
Our setup is two DCs (DC1 and DC2), where DC2 follows DC1.
When we upgrade something, we usually disable shard allocation while we work on the affected node and re-enable it once we're done, as recommended in the docs (see the first point there).
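For reference, by "disable/enable shard allocation" I mean the usual cluster settings calls, roughly like this (the exact value we set is discussed further down in the thread):

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "none"
  }
}

and after the upgrade:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "all"
  }
}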

Today, while we were upgrading DC2, a new index was created on DC1. Replication kicked in and created that index on DC2, where shard allocation was disabled at the time. Because of that, no primary or replica shards were allocated, which makes sense since allocation was disabled.
Here is the weird part: after re-enabling shard allocation, the follower index did not allocate a single shard at all. The index is now in a red state, and therefore the whole cluster is red.

The allocation explain API returns this:

{
  "index" : "ccr-...",
  "shard" : 1,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NEW_INDEX_RESTORED",
    "at" : "2020-01-29T20:44:52.417Z",
    "details" : "restore_source[_ccr_fee8f8dc-ae0d-5408-8123-0e875ec0dbcf_datacenter2/_latest_]",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [ {
    "node_id" : "71xUZS1-SJiTxJmZ7HhMlA",
    "node_name" : "10.....214",
    "transport_address" : "10.....214:9300",
    "node_attributes" : {
      "ml.machine_memory" : "32154439680",
      "ml.max_open_jobs" : "20",
      "xpack.installed" : "true",
      "ml.enabled" : "true"
    },
    "node_decision" : "no",
    "deciders" : [ {
      "decider" : "replica_after_primary_active",
      "decision" : "NO",
      "explanation" : "primary shard for this replica is not yet active"
    }, {
      "decider" : "throttling",
      "decision" : "NO",
      "explanation" : "primary shard for this replica is not yet active"
    } ]
  }, {
    "node_id" : "Lex1Z0aXSkWidZQPmhnRxw",
    "node_name" : "10.....200",
    "transport_address" : "10.....200:9300",
    "node_attributes" : {
      "ml.machine_memory" : "32154439680",
      "ml.max_open_jobs" : "20",
      "xpack.installed" : "true",
      "ml.enabled" : "true"
    },
    "node_decision" : "no",
    "deciders" : [ {
      "decider" : "replica_after_primary_active",
      "decision" : "NO",
      "explanation" : "primary shard for this replica is not yet active"
    }, {
      "decider" : "throttling",
      "decision" : "NO",
      "explanation" : "primary shard for this replica is not yet active"
    } ]
  }, {
    "node_id" : "ZOpwEdM1RC-MgYgSRXREwQ",
    "node_name" : "10.....167",
    "transport_address" : "10.....167:9300",
    "node_attributes" : {
      "ml.machine_memory" : "32154439680",
      "ml.max_open_jobs" : "20",
      "xpack.installed" : "true",
      "ml.enabled" : "true"
    },
    "node_decision" : "no",
    "deciders" : [ {
      "decider" : "replica_after_primary_active",
      "decision" : "NO",
      "explanation" : "primary shard for this replica is not yet active"
    }, {
      "decider" : "throttling",
      "decision" : "NO",
      "explanation" : "primary shard for this replica is not yet active"
    } ]
  } ]
}

I cannot prevent new indices from being created.
My question is: what is the proper way to prevent this? Why does Elasticsearch not allocate the shards of a follower index once we re-enable shard allocation?

Thanks in advance!

Hi @kley,

The allocation explain output above does not contain the root cause, since it explains a replica shard. When you do not pass a specific shard to the explain API, it picks a random unassigned shard.

You should be able to explain the specific shard using:

GET _cluster/allocation/explain
{
  "index": "ccr-...",
  "shard": 0,
  "primary": true
}

Maybe that includes the reason for not allocating the primary?

Hey @HenningAndersen!
Thanks for the fast response.
I ran your request and it returned this:

{
    "index" : "ccr-...",
    "shard" : 0,
    "primary" : true,
    "current_state" : "unassigned",
    "unassigned_info" : {
      "reason" : "NEW_INDEX_RESTORED",
      "at" : "2020-01-29T20:44:52.417Z",
      "details" : "restore_source[_ccr_fee8f8dc-ae0d-5408-8123-0e875ec0dbcf_datacenter2/_latest_]",
      "last_allocation_status" : "no"
    },
    "can_allocate" : "no",
    "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
    "node_allocation_decisions" : [
      {
        "node_id" : "71xUZS1-SJiTxJmZ7HhMlA",
        "node_name" : "10.....214",
        "transport_address" : "10.....214:9300",
        "node_attributes" : {
          "ml.machine_memory" : "32154439680",
          "ml.max_open_jobs" : "20",
          "xpack.installed" : "true",
          "ml.enabled" : "true"
        },
        "node_decision" : "no",
        "weight_ranking" : 1,
        "deciders" : [
          {
            "decider" : "restore_in_progress",
            "decision" : "NO",
            "explanation" : "shard has failed to be restored from the snapshot [_ccr_fee8f8dc-ae0d-5408-8123-0e875ec0dbcf_datacenter2:_latest_/_latest_] because of [restore_source[_ccr_fee8f8dc-ae0d-5408-8123-0e875ec0dbcf_datacenter2/_latest_]] - manually close or delete the index [ccr-fdi_problem-global-longer3-m2020.01] in order to retry to restore the snapshot again or use the reroute API to force the allocation of an empty primary shard"
          }
        ]
      },
      {
        "node_id" : "Lex1Z0aXSkWidZQPmhnRxw",
        "node_name" : "10.....200",
        "transport_address" : "10.....200:9300",
        "node_attributes" : {
          "ml.machine_memory" : "32154439680",
          "ml.max_open_jobs" : "20",
          "xpack.installed" : "true",
          "ml.enabled" : "true"
        },
        "node_decision" : "no",
        "weight_ranking" : 2,
        "deciders" : [
          {
            "decider" : "restore_in_progress",
            "decision" : "NO",
            "explanation" : "shard has failed to be restored from the snapshot [_ccr_fee8f8dc-ae0d-5408-8123-0e875ec0dbcf_datacenter2:_latest_/_latest_] because of [restore_source[_ccr_fee8f8dc-ae0d-5408-8123-0e875ec0dbcf_datacenter2/_latest_]] - manually close or delete the index [ccr-...] in order to retry to restore the snapshot again or use the reroute API to force the allocation of an empty primary shard"
          }
        ]
      },
      {
        "node_id" : "ZOpwEdM1RC-MgYgSRXREwQ",
        "node_name" : "10.....167",
        "transport_address" : "10.....167:9300",
        "node_attributes" : {
          "ml.machine_memory" : "32154439680",
          "ml.max_open_jobs" : "20",
          "xpack.installed" : "true",
          "ml.enabled" : "true"
        },
        "node_decision" : "no",
        "weight_ranking" : 3,
        "deciders" : [
          {
            "decider" : "restore_in_progress",
            "decision" : "NO",
            "explanation" : "shard has failed to be restored from the snapshot [_ccr_fee8f8dc-ae0d-5408-8123-0e875ec0dbcf_datacenter2:_latest_/_latest_] because of [restore_source[_ccr_fee8f8dc-ae0d-5408-8123-0e875ec0dbcf_datacenter2/_latest_]] - manually close or delete the index [ccr-...] in order to retry to restore the snapshot again or use the reroute API to force the allocation of an empty primary shard"
          }
        ]
      }
    ]
  }

Are you guys using snapshot/restore for CCR?
The suggested approach of deleting the index and trying again was my fix as well, but I'm still curious how we can prevent this in the future.
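For completeness, the fix looked roughly like this (the index and remote cluster names below are placeholders, since ours are redacted): delete the red follower index and set it up again with the CCR follow API:

DELETE <follower-index>

PUT /<follower-index>/_ccr/follow
{
  "remote_cluster": "<remote-cluster-name>",
  "leader_index": "<leader-index>"
}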

Hi @kley,

Which version of ES are you on? Quite a bit has happened to snapshot/restore in the last year.

And yes, the initial copy of a follower index is established by restoring from a CCR repository that serves files from the leader cluster. So the snapshot/restore infrastructure is used, though no external snapshot is generated in the process.

Ah, sorry for not providing that information.
We are using 6.8.1. There's a ticket in our backlog for upgrading Elasticsearch to 7.x, but that will not happen in the near future.

Hi @kley,

I wanted to try to reproduce this. Can you elaborate on how you disable allocation? Normally we only recommend disabling the allocation of replicas, but I wonder whether in your case you disabled allocation completely (i.e. set it to "none" instead of "primaries")?
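For the upgrade case, the recommended approach would be something along these lines, so that primaries of newly created indices (including follower indices) can still be allocated:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}

and once the node is back, re-enable everything by setting the value back to "all" (or to null to restore the default).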

Ah, interesting, that might be the issue here.
We usually disable allocation completely (i.e. set it to "none").
I will forward that and try it.

Thanks @HenningAndersen for helping out!
