Replica shards of newly created indices remain UNASSIGNED

Hello,

We haven't experienced this issue before, but recently we noticed that the primary shards of rollover-created indices are allocated normally, while the replica shards always remain in an UNASSIGNED state. We created a test index to verify this, and the same issue occurred: only the replica shards are not being allocated.
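
For reference, the stuck replica also shows up in the _cat shards output (shown here for our test index; the unassigned.reason column matches the allocation explain output below):

GET _cat/shards/test-index-replica?v&h=index,shard,prirep,state,unassigned.reason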

Below is the allocation explain output for the replica shard of the test index:

"index": "test-index-replica",
"shard": 0,
"primary": false,
"current_state": "unassigned",
"unassigned_info": {
  "reason": "INDEX_CREATED",
  "at": "2025-06-23T",
  "last_allocation_status": "no_attempt"
},
"can_allocate": "yes",
"allocate_explanation": "Elasticsearch can allocate the shard.",
"target_node": {
  "id": "4haAS8CuT-6xhgA",
  "name": "data5",
  "transport_address": "IP:9300",
  "attributes": {
    "transform.config_version": "10.0.0",
    "xpack.installed": "true",
    "ml.config_version": "12.0.0"
  },
  "roles": [
    "data",
    "data_cold",
    "data_content",
    "data_frozen",
    "data_hot",
    "data_warm",
    "ingest",
    "remote_cluster_client",
    "transform"
  ]
},
"node_allocation_decisions": [
  {
    "node_id": "4haAS8CuT-6xhgA",
    "node_name": "data5",
    "transport_address": "IP:9300",
    "node_attributes": {
      "transform.config_version": "10.0.0",
      "xpack.installed": "true",
      "ml.config_version": "12.0.0"
    },
    "roles": [
      "data",
      "data_cold",
      "data_content",
      "data_frozen",
      "data_hot",
      "data_warm",
      "ingest",
      "remote_cluster_client",
      "transform"
    ],
    "node_decision": "yes",
    "weight_ranking": 3
  }
]

However, if we manually route the replica shard to the target node as shown below, it gets allocated successfully:

POST /_cluster/reroute
{
  "commands": [
    {
      "allocate_replica": {
        "index": "test-index-replica",
        "shard": 0,
        "node": "data5"
      }
    }
  ]
}
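
For completeness: since last_allocation_status above is "no_attempt" rather than a failed allocation, retrying failed allocations (below) would not be expected to help in this case, but it is the usual first step for shards that have exhausted their allocation retries:

POST /_cluster/reroute?retry_failed=true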

There is no disk watermark issue or capacity limitation on the target node (it has 1 TB of free disk space available).
Also, there are no ongoing recovery or rebalancing operations, and there are no pending tasks.
Our Elasticsearch cluster is running version 8.14.3.
We would appreciate any assistance or guidance to resolve this issue.

Thank you.

Hello @onel

Welcome to the community!!

Your Elasticsearch version is 8.14.3; could you please share the following information?

  1. What is the total number of data nodes in your cluster?
  2. What are the index settings (number of primaries and replicas) for this index?

I found a blog post about a similar issue:

Thanks!!

Thank you for your response. However, the issue described in the link you provided does not appear to be related to my environment.
(In our case, "index.routing.allocation.total_shards_per_node" is -1, i.e. unlimited.)

We have 8 data nodes, and the index was created as follows:

PUT test-index-replica
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "message": {
        "type": "text"
      },
      "timestamp": {
        "type": "date"
      }
    }
  }
}
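
After creating the index, the shard states can be checked with, for example:

GET _cluster/health/test-index-replica?level=shards

In our case this shows the single replica in the unassigned count.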

Can you reproduce this in 8.18.3? I have a hunch this may be something that got fixed in the last year or so since 8.14 was released.

Thank you for your response.

We were not able to reproduce the issue in a test cluster running the same version, so we cannot be certain that upgrading to version 8.18.3 will resolve the problem.

If this issue has been addressed sometime after the release of 8.14, could you please let us know where we can find the release notes or documentation confirming that it has been fixed?

Thank you.

No, sorry, I can't offer enough of my time to do that for you. It's the other way around: you must confirm it is still a problem in 8.18.3 before I can justify spending the time needed to investigate more deeply.

Hello @onel

As David suggested, this may need to be checked after the upgrade to see whether the issue persists.

Still, I am curious to know the following information, if you can share it:

  1. Are replicas unassigned for all indices in this cluster, and are they being allocated manually using POST /_cluster/reroute?
  2. Was this working previously, and did the issue start suddenly?
  3. Have any cluster-level settings been changed recently?
  4. If possible, please share the output of the following:

GET /_cluster/health
GET /test-index-replica/_settings
GET /_cat/allocation?v=true

GET /_cluster/allocation/explain
{
  "index": "test-index-replica",
  "shard": 0,
  "primary": false
}

Please also share the cluster-level value of the following setting:
"cluster.routing.allocation.enable": "all"

Thanks!!

Thank you for your response. We will proceed with the upgrade to version 8.18.3 and continue to monitor whether the issue persists.

  • No, the existing replicas are working normally. This issue only occurs with newly created indices, whether created via rollover or manually for testing. Replica shards remain unassigned unless they are manually allocated, even though they should be allocated automatically.
  • Yes, this issue started occurring suddenly.
  • There have been no configuration changes.
  • If helpful, please find the following information:

GET /_cluster/health

{
  "cluster_name": "cluster",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 12,
  "number_of_data_nodes": 8,
  "active_primary_shards": 440,
  "active_shards": 863,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 7,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number":

GET /test-index-replica/_settings

{
  "test-index-replica": {
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "test-index-replica",
        "creation_date": "",
        "number_of_replicas": "1",
        "uuid": "",
        "version": {
          "created": ""
        }
      }
    }
  }
}

The output of GET /_cluster/allocation/explain has already been provided above.
The setting "cluster.routing.allocation.enable" is set to "all".

Hello @onel

Thank you for sharing the details.

Based on the details you shared, I have observed a similar situation where shards were not allocated after a rollover. The cause turned out to be a change to the parameter below, whose default value is "2.0E-11"; once it was reverted to the default, new shards were allocated without any issue.

"cluster.routing.allocation.balance.disk_usage": "2.0E-11"

You can also review the Elasticsearch logs to see whether they contain more information that would help with further troubleshooting.

Thanks!!

Thank you for your response.

It seems that the "cluster.routing.allocation.balance.disk_usage" option is currently at its default value of "2.0E-11". Could you please clarify what the actual default value of this setting is, and what it means?

Also, are you saying that this setting, which is intended to balance disk usage evenly across nodes, is the reason why replica shards are not being allocated? I'm not quite sure I understand that part.

Hello @onel

The value you have set is the actual default value.

Since it is already at the default, your issue is most likely not related to this parameter.

More information related to this parameter:
Increasing the value of cluster.routing.allocation.balance.disk_usage makes Elasticsearch more likely to equalize disk usage across nodes. This means that nodes with higher disk usage are less likely to receive additional shards, and shards may be moved to nodes with lower disk usage to achieve a more balanced distribution.

So, in some cases, it can take time for a shard to be allocated while disk usage is being balanced.

The only thing left is to review the Elasticsearch logs and see whether there are any messages related to this shard that could help troubleshoot further.

Thanks!!

Hello, the issue was resolved after upgrading to version 8.18.3. Thank you!
