Hot shards get allocated to cold nodes

Hi, I set up ILM, but was surprised to find that, upon index creation, my index shards were allocated to both hot and cold nodes.

I am on version 7.13.2
I have 3 hot nodes and 2 cold nodes; _cat/nodes shows:

10.2.6.4 46 88 2 0.25 0.35 0.41 dhimr - elastic-es-master-0
10.2.4.2 23 34 3 0.08 0.27 0.33 cd    - elastic-es-cold-1
10.2.7.2 22 56 3 0.11 0.35 0.52 cd    - elastic-es-cold-0
10.2.2.3 52 56 5 0.25 0.23 0.14 dhimr - elastic-es-master-1
10.2.4.3 78 62 3 0.08 0.27 0.33 dhimr * elastic-es-master-2
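
For context, a hot/cold ILM policy of the general shape in use here looks roughly like this; the rollover and age thresholds below are placeholders, not the real values:

PUT _ilm/policy/twitter_tweets_prod_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "7d"
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {}
      }
    }
  }
}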

I have also tried adding the allocation settings explicitly to the index:

{
  "twitter_tweets_prod-000001" : {
    "settings" : {
      "index" : {
        "lifecycle" : {
          "name" : "twitter_tweets_prod_policy",
          "rollover_alias" : "twitter_tweets_prod"
        },
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier" : "data_hot",
              "_tier_preference" : "data_hot"
            }
          }
        },
        "refresh_interval" : "60s",
        "number_of_shards" : "4",
        "provided_name" : "twitter_tweets_prod-000001",
        "creation_date" : "1625755381203",
        "priority" : "100",
        "number_of_replicas" : "0",
        "uuid" : "cB3uGYHZRA6YJ5Xw5aEoyA",
        "version" : {
          "created" : "7130299"
        }
      }
    }
  }
}
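
For reference, putting that routing setting in place is just an index settings update along these lines (shown here only with the non-deprecated _tier_preference form; the deprecated _tier include visible above is set the same way):

PUT /twitter_tweets_prod-000001/_settings
{
  "index.routing.allocation.include._tier_preference": "data_hot"
}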

And finally, in desperation, I tried to manually reroute the shards:

POST /_cluster/reroute
{
  "commands": [
    {
      "move": {
        "index": "twitter_tweets_prod-000001", "shard": 1,
        "from_node": "aifeed-elastic-es-cold-1", "to_node": "aifeed-elastic-es-master-0"
      }
    }
  ]
}

This moves the shard to the hot node... but Elasticsearch compensates by moving another shard of the same index back to the cold node.
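
This is easy to watch with a cat-shards call along these lines (the column and sort parameters are just for readability):

GET _cat/shards/twitter_tweets_prod-000001?v&h=index,shard,prirep,state,node&s=node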

If I run the cluster allocation explain API against one of these shards, I get the following.
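
The request is the standard explain call, roughly like this (the shard number simply mirrors the one in the response below, so treat it as an illustration rather than a transcript of the exact call):

GET _cluster/allocation/explain
{
  "index": "twitter_tweets_prod-000001",
  "shard": 3,
  "primary": true
}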

#! [index.routing.allocation.include._tier] setting was deprecated in Elasticsearch and will be removed in a future release! See the breaking changes documentation for the next major version.
{
  "index" : "twitter_tweets_prod-000001",
  "shard" : 3,
  "primary" : true,
  "current_state" : "started",
  "current_node" : {
    "id" : "y0edQjF1SsqXQaNQnvl5iw",
    "name" : "aifeed-elastic-es-cold-1",
    "transport_address" : "10.2.4.2:9300",
    "attributes" : {
      "k8s_node_name" : "gke-aifeed-k8s-res-pool-v3-3153a58d-q096",
      "xpack.installed" : "true",
      "transform.node" : "false"
    },
    "weight_ranking" : 1
  },
  "can_remain_on_current_node" : "yes",
  "can_rebalance_cluster" : "yes",
  "can_rebalance_to_other_node" : "no",
  "rebalance_explanation" : "cannot rebalance as no target node exists that can both allocate this shard and improve the cluster balance",
  "node_allocation_decisions" : [
    {
      "node_id" : "ljHWnj2yR5ijml1isBiRhA",
      "node_name" : "aifeed-elastic-es-cold-0",
      "transport_address" : "10.2.7.2:9300",
      "node_attributes" : {
        "k8s_node_name" : "gke-aifeed-k8s-res-pool-v3-3153a58d-mz08",
        "xpack.installed" : "true",
        "transform.node" : "false"
      },
      "node_decision" : "worse_balance",
      "weight_ranking" : 1
    },
    {
      "node_id" : "oLhz15CJTiSgs1w71diwmg",
      "node_name" : "aifeed-elastic-es-master-0",
      "transport_address" : "10.2.6.4:9300",
      "node_attributes" : {
        "k8s_node_name" : "gke-aifeed-k8s-res-pool-v3-3153a58d-n8bb",
        "xpack.installed" : "true",
        "transform.node" : "false"
      },
      "node_decision" : "worse_balance",
      "weight_ranking" : 1
    },
    {
      "node_id" : "zDwBnN7wTDGfd9nxoEofSA",
      "node_name" : "aifeed-elastic-es-master-2",
      "transport_address" : "10.2.4.3:9300",
      "node_attributes" : {
        "k8s_node_name" : "gke-aifeed-k8s-res-pool-v3-3153a58d-q096",
        "xpack.installed" : "true",
        "transform.node" : "false"
      },
      "node_decision" : "worse_balance",
      "weight_ranking" : 2
    },
    {
      "node_id" : "Xe9FIwzxRLuDj-mvKF8Sdg",
      "node_name" : "aifeed-elastic-es-master-1",
      "transport_address" : "10.2.2.3:9300",
      "node_attributes" : {
        "k8s_node_name" : "gke-aifeed-k8s-res-pool-v3-3153a58d-34ui",
        "xpack.installed" : "true",
        "transform.node" : "false"
      },
      "node_decision" : "worse_balance",
      "weight_ranking" : 3
    }
  ]
}

So... what am I missing? The allocation explain output seems to completely ignore my settings that "prefer" the hot tier, and just wants to spread all shards equally across all the nodes.
Thanks in advance for any help.

To answer my own question: I found that the cluster was misconfigured. My cold nodes had:

node.roles: [ data, data_cold ]

So the hot shards were allowed to allocate to the cold nodes via the generic "data" role, which makes a node a member of every data tier, meaning a data_hot tier preference also matches the cold nodes.
I removed the "data" role, and the problem was solved.
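
In elasticsearch.yml terms (or wherever the node roles are configured), the cold nodes should only carry the tier-specific role, e.g.

node.roles: [ data_cold ]

With only that role they no longer match a data_hot tier preference, so new indices stay on the hot nodes until ILM moves them to the cold phase.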
