Auto_expand_replicas not working on enrich index

Yes, OK. I created the index with an explicit null value for this setting:

PUT /testing_auto_expand_replicas
{
  "settings": {
    "auto_expand_replicas": "0-all",
    "index.routing.allocation.include._tier_preference": null
  }
}

and the index is still not replicated:

GET /testing_auto_expand_replicas/_settings

{
  "testing_auto_expand_replicas" : {
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "auto_expand_replicas" : "0-all",
        "provided_name" : "testing_auto_expand_replicas",
        "max_inner_result_window" : "10000",
        "creation_date" : "1662467313585",
        "number_of_replicas" : "0",
        "uuid" : "LuWoQP9UQjuyZAKh1pet6w",
        "version" : {
          "created" : "7170199"
        }
      }
    }
  }
}
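
A quick way to confirm whether the replicas were actually created is the cat shards API; with auto_expand_replicas working on a three-node cluster, you would expect one STARTED primary plus two STARTED replicas here (the index name matches the test above):

GET _cat/shards/testing_auto_expand_replicas?v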

And what is the result of the allocation explain for this test index?

This one: GET _cluster/allocation/explain?include_yes_decisions

Here it is:

GET _cluster/allocation/explain?include_yes_decisions
{
  "index": "testing_auto_expand_replicas",
  "shard": 0,
  "primary": true
}

{
  "index" : "testing_auto_expand_replicas",
  "shard" : 0,
  "primary" : true,
  "current_state" : "started",
  "current_node" : {
    "id" : "PS1WfPGWS4KvFDJbACEG4A",
    "name" : "es-01.stordata.dc",
    "transport_address" : "10.254.249.10:9300",
    "attributes" : {
      "ml.machine_memory" : "135075225600",
      "ml.max_open_jobs" : "512",
      "xpack.installed" : "true",
      "ml.max_jvm_size" : "33285996544",
      "transform.node" : "true"
    },
    "weight_ranking" : 1
  },
  "can_remain_on_current_node" : "yes",
  "can_rebalance_cluster" : "yes",
  "can_rebalance_to_other_node" : "no",
  "rebalance_explanation" : "cannot rebalance as no target node exists that can both allocate this shard and improve the cluster balance",
  "node_allocation_decisions" : [
    {
      "node_id" : "NeLqOXVAQa-38GN1jWyn_Q",
      "node_name" : "es-03.stordata.dc",
      "transport_address" : "10.254.249.12:9300",
      "node_attributes" : {
        "ml.machine_memory" : "135075368960",
        "ml.max_open_jobs" : "512",
        "xpack.installed" : "true",
        "ml.max_jvm_size" : "33285996544",
        "transform.node" : "true"
      },
      "node_decision" : "worse_balance",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "YES",
          "explanation" : "shard has no previous failures"
        },
        {
          "decider" : "replica_after_primary_active",
          "decision" : "YES",
          "explanation" : "shard is primary and can be allocated"
        },
        {
          "decider" : "enable",
          "decision" : "YES",
          "explanation" : "all allocations are allowed"
        },
        {
          "decider" : "node_version",
          "decision" : "YES",
          "explanation" : "can relocate primary shard from a node with version [7.17.1] to a node with equal-or-newer version [7.17.1]"
        },
        {
          "decider" : "snapshot_in_progress",
          "decision" : "YES",
          "explanation" : "no snapshots are currently running"
        },
        {
          "decider" : "restore_in_progress",
          "decision" : "YES",
          "explanation" : "ignored as shard is not being recovered from a snapshot"
        },
        {
          "decider" : "node_shutdown",
          "decision" : "YES",
          "explanation" : "this node is not currently shutting down"
        },
        {
          "decider" : "node_replacement",
          "decision" : "YES",
          "explanation" : "neither the source nor target node are part of an ongoing node replacement (no replacements)"
        },
        {
          "decider" : "filter",
          "decision" : "YES",
          "explanation" : "node passes include/exclude/require filters"
        },
        {
          "decider" : "same_shard",
          "decision" : "YES",
          "explanation" : "this node does not hold a copy of this shard"
        },
        {
          "decider" : "disk_threshold",
          "decision" : "YES",
          "explanation" : "enough disk for shard on node, free: [1.3tb], shard size: [226b], free after allocating shard: [1.3tb]"
        },
        {
          "decider" : "throttling",
          "decision" : "YES",
          "explanation" : "below shard recovery limit of outgoing: [0 < 4] incoming: [0 < 4]"
        },
        {
          "decider" : "shards_limit",
          "decision" : "YES",
          "explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
        },
        {
          "decider" : "awareness",
          "decision" : "YES",
          "explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
        },
        {
          "decider" : "data_tier",
          "decision" : "YES",
          "explanation" : "node passes include/exclude/require/prefer tier filters"
        },
        {
          "decider" : "ccr_primary_follower",
          "decision" : "YES",
          "explanation" : "shard is not a follower and is not under the purview of this decider"
        },
        {
          "decider" : "searchable_snapshots",
          "decision" : "YES",
          "explanation" : "decider only applicable for indices backed by searchable snapshots"
        },
        {
          "decider" : "searchable_snapshot_repository_exists",
          "decision" : "YES",
          "explanation" : "this decider only applies to indices backed by searchable snapshots"
        },
        {
          "decider" : "searchable_snapshots_enable",
          "decision" : "YES",
          "explanation" : "decider only applicable for indices backed by searchable snapshots"
        },
        {
          "decider" : "dedicated_frozen_node",
          "decision" : "YES",
          "explanation" : "this node's data roles are not exactly [data_frozen] so it is not a dedicated frozen node"
        }
      ]
    },
    {
      "node_id" : "qZMbs-d_QcOGh3Ve2DS1ug",
      "node_name" : "es-02.stordata.dc",
      "transport_address" : "10.254.249.11:9300",
      "node_attributes" : {
        "ml.machine_memory" : "135075360768",
        "ml.max_open_jobs" : "512",
        "xpack.installed" : "true",
        "ml.max_jvm_size" : "33285996544",
        "transform.node" : "true"
      },
      "node_decision" : "worse_balance",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "YES",
          "explanation" : "shard has no previous failures"
        },
        {
          "decider" : "replica_after_primary_active",
          "decision" : "YES",
          "explanation" : "shard is primary and can be allocated"
        },
        {
          "decider" : "enable",
          "decision" : "YES",
          "explanation" : "all allocations are allowed"
        },
        {
          "decider" : "node_version",
          "decision" : "YES",
          "explanation" : "can relocate primary shard from a node with version [7.17.1] to a node with equal-or-newer version [7.17.1]"
        },
        {
          "decider" : "snapshot_in_progress",
          "decision" : "YES",
          "explanation" : "no snapshots are currently running"
        },
        {
          "decider" : "restore_in_progress",
          "decision" : "YES",
          "explanation" : "ignored as shard is not being recovered from a snapshot"
        },
        {
          "decider" : "node_shutdown",
          "decision" : "YES",
          "explanation" : "node [qZMbs-d_QcOGh3Ve2DS1ug] is preparing to restart, but will remain in the cluster"
        },
        {
          "decider" : "node_replacement",
          "decision" : "YES",
          "explanation" : "neither the source nor target node are part of an ongoing node replacement (no replacements)"
        },
        {
          "decider" : "filter",
          "decision" : "YES",
          "explanation" : "node passes include/exclude/require filters"
        },
        {
          "decider" : "same_shard",
          "decision" : "YES",
          "explanation" : "this node does not hold a copy of this shard"
        },
        {
          "decider" : "disk_threshold",
          "decision" : "YES",
          "explanation" : "enough disk for shard on node, free: [1.3tb], shard size: [226b], free after allocating shard: [1.3tb]"
        },
        {
          "decider" : "throttling",
          "decision" : "YES",
          "explanation" : "below shard recovery limit of outgoing: [0 < 4] incoming: [0 < 4]"
        },
        {
          "decider" : "shards_limit",
          "decision" : "YES",
          "explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
        },
        {
          "decider" : "awareness",
          "decision" : "YES",
          "explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
        },
        {
          "decider" : "data_tier",
          "decision" : "YES",
          "explanation" : "node passes include/exclude/require/prefer tier filters"
        },
        {
          "decider" : "ccr_primary_follower",
          "decision" : "YES",
          "explanation" : "shard is not a follower and is not under the purview of this decider"
        },
        {
          "decider" : "searchable_snapshots",
          "decision" : "YES",
          "explanation" : "decider only applicable for indices backed by searchable snapshots"
        },
        {
          "decider" : "searchable_snapshot_repository_exists",
          "decision" : "YES",
          "explanation" : "this decider only applies to indices backed by searchable snapshots"
        },
        {
          "decider" : "searchable_snapshots_enable",
          "decision" : "YES",
          "explanation" : "decider only applicable for indices backed by searchable snapshots"
        },
        {
          "decider" : "dedicated_frozen_node",
          "decision" : "YES",
          "explanation" : "this node's data roles are not exactly [data_frozen] so it is not a dedicated frozen node"
        }
      ]
    }
  ]
}

Can you try setting number_of_replicas to a higher value as well? I believe auto_expand_replicas should override this setting, but maybe that has changed?

Sure thing. In this case, the number_of_replicas setting is immediately overwritten to 0:

PUT /testing_auto_expand_replicas
{
  "settings": {
    "auto_expand_replicas": "0-all",
    "index.routing.allocation.include._tier_preference": null,
    "number_of_replicas": 2
  }
}
GET /testing_auto_expand_replicas/_settings

{
  "testing_auto_expand_replicas" : {
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "auto_expand_replicas" : "0-all",
        "provided_name" : "testing_auto_expand_replicas",
        "max_inner_result_window" : "10000",
        "creation_date" : "1662468286411",
        "number_of_replicas" : "0",
        "uuid" : "Ed1l4I8aS7ywEI0BNpuNwA",
        "version" : {
          "created" : "7170199"
        }
      }
    }
  }
}

And the allocation explain, in case you need it:

{
  "index" : "testing_auto_expand_replicas",
  "shard" : 0,
  "primary" : true,
  "current_state" : "started",
  "current_node" : {
    "id" : "PS1WfPGWS4KvFDJbACEG4A",
    "name" : "es-01.stordata.dc",
    "transport_address" : "10.254.249.10:9300",
    "attributes" : {
      "ml.machine_memory" : "135075225600",
      "ml.max_open_jobs" : "512",
      "xpack.installed" : "true",
      "ml.max_jvm_size" : "33285996544",
      "transform.node" : "true"
    },
    "weight_ranking" : 1
  },
  "can_remain_on_current_node" : "yes",
  "can_rebalance_cluster" : "yes",
  "can_rebalance_to_other_node" : "no",
  "rebalance_explanation" : "cannot rebalance as no target node exists that can both allocate this shard and improve the cluster balance",
  "node_allocation_decisions" : [
    {
      "node_id" : "NeLqOXVAQa-38GN1jWyn_Q",
      "node_name" : "es-03.stordata.dc",
      "transport_address" : "10.254.249.12:9300",
      "node_attributes" : {
        "ml.machine_memory" : "135075368960",
        "ml.max_open_jobs" : "512",
        "xpack.installed" : "true",
        "ml.max_jvm_size" : "33285996544",
        "transform.node" : "true"
      },
      "node_decision" : "worse_balance",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "YES",
          "explanation" : "shard has no previous failures"
        },
        {
          "decider" : "replica_after_primary_active",
          "decision" : "YES",
          "explanation" : "shard is primary and can be allocated"
        },
        {
          "decider" : "enable",
          "decision" : "YES",
          "explanation" : "all allocations are allowed"
        },
        {
          "decider" : "node_version",
          "decision" : "YES",
          "explanation" : "can relocate primary shard from a node with version [7.17.1] to a node with equal-or-newer version [7.17.1]"
        },
        {
          "decider" : "snapshot_in_progress",
          "decision" : "YES",
          "explanation" : "no snapshots are currently running"
        },
        {
          "decider" : "restore_in_progress",
          "decision" : "YES",
          "explanation" : "ignored as shard is not being recovered from a snapshot"
        },
        {
          "decider" : "node_shutdown",
          "decision" : "YES",
          "explanation" : "this node is not currently shutting down"
        },
        {
          "decider" : "node_replacement",
          "decision" : "YES",
          "explanation" : "neither the source nor target node are part of an ongoing node replacement (no replacements)"
        },
        {
          "decider" : "filter",
          "decision" : "YES",
          "explanation" : "node passes include/exclude/require filters"
        },
        {
          "decider" : "same_shard",
          "decision" : "YES",
          "explanation" : "this node does not hold a copy of this shard"
        },
        {
          "decider" : "disk_threshold",
          "decision" : "YES",
          "explanation" : "enough disk for shard on node, free: [1.3tb], shard size: [226b], free after allocating shard: [1.3tb]"
        },
        {
          "decider" : "throttling",
          "decision" : "YES",
          "explanation" : "below shard recovery limit of outgoing: [0 < 4] incoming: [0 < 4]"
        },
        {
          "decider" : "shards_limit",
          "decision" : "YES",
          "explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
        },
        {
          "decider" : "awareness",
          "decision" : "YES",
          "explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
        },
        {
          "decider" : "data_tier",
          "decision" : "YES",
          "explanation" : "node passes include/exclude/require/prefer tier filters"
        },
        {
          "decider" : "ccr_primary_follower",
          "decision" : "YES",
          "explanation" : "shard is not a follower and is not under the purview of this decider"
        },
        {
          "decider" : "searchable_snapshots",
          "decision" : "YES",
          "explanation" : "decider only applicable for indices backed by searchable snapshots"
        },
        {
          "decider" : "searchable_snapshot_repository_exists",
          "decision" : "YES",
          "explanation" : "this decider only applies to indices backed by searchable snapshots"
        },
        {
          "decider" : "searchable_snapshots_enable",
          "decision" : "YES",
          "explanation" : "decider only applicable for indices backed by searchable snapshots"
        },
        {
          "decider" : "dedicated_frozen_node",
          "decision" : "YES",
          "explanation" : "this node's data roles are not exactly [data_frozen] so it is not a dedicated frozen node"
        }
      ]
    },
    {
      "node_id" : "qZMbs-d_QcOGh3Ve2DS1ug",
      "node_name" : "es-02.stordata.dc",
      "transport_address" : "10.254.249.11:9300",
      "node_attributes" : {
        "ml.machine_memory" : "135075360768",
        "ml.max_open_jobs" : "512",
        "xpack.installed" : "true",
        "ml.max_jvm_size" : "33285996544",
        "transform.node" : "true"
      },
      "node_decision" : "worse_balance",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "YES",
          "explanation" : "shard has no previous failures"
        },
        {
          "decider" : "replica_after_primary_active",
          "decision" : "YES",
          "explanation" : "shard is primary and can be allocated"
        },
        {
          "decider" : "enable",
          "decision" : "YES",
          "explanation" : "all allocations are allowed"
        },
        {
          "decider" : "node_version",
          "decision" : "YES",
          "explanation" : "can relocate primary shard from a node with version [7.17.1] to a node with equal-or-newer version [7.17.1]"
        },
        {
          "decider" : "snapshot_in_progress",
          "decision" : "YES",
          "explanation" : "no snapshots are currently running"
        },
        {
          "decider" : "restore_in_progress",
          "decision" : "YES",
          "explanation" : "ignored as shard is not being recovered from a snapshot"
        },
        {
          "decider" : "node_shutdown",
          "decision" : "YES",
          "explanation" : "node [qZMbs-d_QcOGh3Ve2DS1ug] is preparing to restart, but will remain in the cluster"
        },
        {
          "decider" : "node_replacement",
          "decision" : "YES",
          "explanation" : "neither the source nor target node are part of an ongoing node replacement (no replacements)"
        },
        {
          "decider" : "filter",
          "decision" : "YES",
          "explanation" : "node passes include/exclude/require filters"
        },
        {
          "decider" : "same_shard",
          "decision" : "YES",
          "explanation" : "this node does not hold a copy of this shard"
        },
        {
          "decider" : "disk_threshold",
          "decision" : "YES",
          "explanation" : "enough disk for shard on node, free: [1.3tb], shard size: [226b], free after allocating shard: [1.3tb]"
        },
        {
          "decider" : "throttling",
          "decision" : "YES",
          "explanation" : "below shard recovery limit of outgoing: [0 < 4] incoming: [0 < 4]"
        },
        {
          "decider" : "shards_limit",
          "decision" : "YES",
          "explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
        },
        {
          "decider" : "awareness",
          "decision" : "YES",
          "explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
        },
        {
          "decider" : "data_tier",
          "decision" : "YES",
          "explanation" : "node passes include/exclude/require/prefer tier filters"
        },
        {
          "decider" : "ccr_primary_follower",
          "decision" : "YES",
          "explanation" : "shard is not a follower and is not under the purview of this decider"
        },
        {
          "decider" : "searchable_snapshots",
          "decision" : "YES",
          "explanation" : "decider only applicable for indices backed by searchable snapshots"
        },
        {
          "decider" : "searchable_snapshot_repository_exists",
          "decision" : "YES",
          "explanation" : "this decider only applies to indices backed by searchable snapshots"
        },
        {
          "decider" : "searchable_snapshots_enable",
          "decision" : "YES",
          "explanation" : "decider only applicable for indices backed by searchable snapshots"
        },
        {
          "decider" : "dedicated_frozen_node",
          "decision" : "YES",
          "explanation" : "this node's data roles are not exactly [data_frozen] so it is not a dedicated frozen node"
        }
      ]
    }
  ]
}

Sounds like a bug to me then.

OK, sad. I'm opening an issue on GitHub then...
Do you think restarting the nodes might help?

It would be interesting to see if someone can reproduce it.

I've opened an issue: Indices with auto_expand_replicas are not replicated (elastic/elasticsearch#89823)

I assume you tested (just as a data point) that a plain old static setting of 2 replicas works without auto-expand?

Yes. All of our indices use a static number of replicas (either 1 or 2) and this works well. The problem only appears when auto_expand_replicas enters the game.
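
For completeness, the static-replica control test looks like this (the index name is illustrative); the resulting replicas allocate without any issue:

PUT /testing_static_replicas
{
  "settings": {
    "number_of_replicas": 2
  }
}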


Guys, I've resolved the issue, though I believe there's still something strange in Elasticsearch's behavior. Let me explain; hopefully it will help others who land here!

So the root cause is that we rolling-restarted the cluster weeks ago to change the nodes' underlying storage, and we did this using the node shutdown API. We forgot to clear the shutdown requests afterwards in order to "resume normal operation", as the documentation puts it... Right after the DELETE calls cleared the requests, the system indices started replicating to the two other nodes.
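
For anyone hitting the same situation: the pending shutdown records can be listed, and then cleared per node, with the node shutdown API (the node ID below is a placeholder; use the IDs returned by the GET call):

GET _nodes/shutdown

DELETE _nodes/<node_id>/shutdown

In our case, the allocation explain above already hinted at it: the node_shutdown decider for es-02 reported "preparing to restart, but will remain in the cluster".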

What I find strange is that regular indices were replicated correctly; only auto_expand_replicas was not working. If the nodes, once they rejoined the cluster after the restart, could not take replica shards because of the uncleared shutdown requests, I would have expected a more general problem. Also, all allocation deciders reply with a YES decision...

I'll comment on the GitHub issue; I think we can follow up there.

Thanks everyone for your support.

Cheers,
David

