Replacing a cluster node

Hi, we have a 3-node cluster with 3 master-eligible nodes. Today I needed to replace one node with another on a new server. I simply shut down ES on the old server and started ES on the new server. It did join the cluster according to _cat/nodes, but no data/shards started being replicated to it. Cluster health remained yellow until the 2 remaining nodes had recovered all the missing data between themselves. Now I have 2 nodes each holding the full dataset (according to df -h), and the new node doesn't have any index data. What's going on, and how do I let it join the cluster normally and copy some of the data to itself? Thanks.

ES version 7.17.11

It's best to use the cluster allocation explain API to explain the allocation of shards. If you need help understanding its output, please feel free to share it here.
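If you're not sure which shard to ask about, you can also call it with no request body and it will pick an arbitrary unassigned shard to explain (if there are no unassigned shards it returns an error instead). A minimal example, assuming a node reachable on localhost:9200:

curl -X GET "localhost:9200/_cluster/allocation/explain?pretty"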


The cluster allocation explain API will certainly tell you why shards haven't moved to your new node. I run into this sometimes, and allocation explain really helps and gives you a great starting point as to where to go next.

It would be interesting to see your output.

You may also want to look at running the _cluster/reroute command.

Cluster reroute API | Elasticsearch Guide [8.11] | Elastic

The cluster will attempt to allocate a shard a maximum of index.allocation.max_retries times in a row (defaults to 5), before giving up and leaving the shard unallocated.

I believe the reroute command restarts the process.
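For reference, if shards do end up unassigned after hitting that retry limit, something like this (assuming a node reachable on localhost:9200) asks the cluster to retry allocating them:

curl -X POST "localhost:9200/_cluster/reroute?retry_failed=true&pretty"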


I don't think this is great advice. If a call to this API is needed, the allocation explain API will tell you about it. If it's not needed then it's best not to call it.

Sorry, I meant to add that I agree with the rest of the message from @intrepid1 🙂 Allocation explain really is a useful API in this context. It's unlikely to relate to index.allocation.max_retries in this situation, since it sounds like there are no unassigned shards, but this limit is still useful to know about.

Thanks for the tips! Here's the explanation output:

The first server is the one with no data; the second and third each have the full dataset.

[rihad@eldey ~]$ curl -X GET "amiata.local:9200/_cluster/allocation/explain?pretty" -H 'Content-Type: application/json' -d'
{
  "index": "my-index",
  "shard": 0,
  "primary": false,
  "current_node": "amiata.example.com"
}
'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "unable to find a replica shard assigned to node [amiata.example.com]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "unable to find a replica shard assigned to node [amiata.example.com]"
  },
  "status" : 400
}
[rihad@eldey ~]$ curl -X GET "eldey.local:9200/_cluster/allocation/explain?pretty" -H 'Content-Type: application/json' -d'
{
  "index": "my-index",
  "shard": 0,
  "primary": false,
  "current_node": "eldey.example.com"
}
'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "unable to find a replica shard assigned to node [eldey.example.com]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "unable to find a replica shard assigned to node [eldey.example.com]"
  },
  "status" : 400
}
[rihad@eldey ~]$ curl -X GET "pico.local:9200/_cluster/allocation/explain?pretty" -H 'Content-Type: application/json' -d'
{
  "index": "my-index",
  "shard": 0,
  "primary": false,
  "current_node": "pico.example.com"
}
'
{
  "index" : "my-index",
  "shard" : 0,
  "primary" : false,
  "current_state" : "started",
  "current_node" : {
    "id" : "qj963ExLRI-bceFokseNTQ",
    "name" : "pico.example.com",
    "transport_address" : "172.16.1.11:9300",
    "attributes" : {
      "xpack.installed" : "true",
      "transform.node" : "true"
    },
    "weight_ranking" : 1
  },
  "can_remain_on_current_node" : "yes",
  "can_rebalance_cluster" : "yes",
  "can_rebalance_to_other_node" : "no",
  "rebalance_explanation" : "cannot rebalance as no target node exists that can both allocate this shard and improve the cluster balance",
  "node_allocation_decisions" : [
    {
      "node_id" : "jxM3gIeDQAu8ex9ghdjshg",
      "node_name" : "eldey.example.com",
      "transport_address" : "172.16.1.8:9300",
      "node_attributes" : {
        "xpack.installed" : "true",
        "transform.node" : "true"
      },
      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "a copy of this shard is already allocated to this node [[my-index][0], node[jxM3gIeDQAu8ex9ghdjshg], [P], s[STARTED], a[id=gh3V_k79Rg6KW2OP02ijTw]]"
        }
      ]
    },
    {
      "node_id" : "pXeE9Ij_T-64jFUDMhJ34w",
      "node_name" : "amiata.example.com",
      "transport_address" : "172.16.1.6:9300",
      "node_attributes" : {
        "xpack.installed" : "true",
        "transform.node" : "true"
      },
      "node_decision" : "worse_balance",
      "weight_ranking" : 1
    }
  ]
}

Hmm that says that moving the shard to this node would make the cluster more imbalanced. What does GET _cat/allocation return?

[rihad@eldey ~]$ curl pico.local:9200/_cat/allocation?v
shards disk.indices disk.used disk.avail disk.total disk.percent host        ip          node
     0           0b     284kb      1.3tb      1.3tb            0 172.16.1.6  172.16.1.6  amiata.example.com
     1       23.6gb    23.4gb      1.6tb      1.6tb            1 172.16.1.8  172.16.1.8  eldey.example.com
     1       23.6gb    23.3gb      1.6tb      1.6tb            1 172.16.1.11 172.16.1.11 pico.example.com

Ok, you only have 2 shards, so with three nodes there's always going to be one node with zero shards.
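If you do want a copy of the data on every node, one option would be to raise the replica count for that index to 2 (one primary plus two replicas, one copy per node). A sketch, using the index name from your output and assuming the node is reachable on localhost:9200:

curl -X PUT "localhost:9200/my-index/_settings?pretty" -H 'Content-Type: application/json' -d'
{
  "index": {
    "number_of_replicas": 2
  }
}
'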


Oh, indeed, another cluster that hasn't had a node replacement also exhibits this behavior with 2 shards:

shards disk.indices disk.used disk.avail disk.total disk.percent host        ip          node
     0           0b     228kb      1.6tb      1.6tb            0 172.16.1.21 172.16.1.21 uranus.local
     1        4.3gb     4.2gb      1.6tb      1.6tb            0 172.16.1.25 172.16.1.25 sun.local
     1        4.4gb     4.2gb    766.2gb    770.5gb            0 172.16.1.18 172.16.1.18 moon.local

Yet another with 10 shards has this:

shards disk.indices disk.used disk.avail disk.total disk.percent host        ip          node
     3        4.4gb     4.1gb      3.2tb      3.2tb            0 172.16.1.23 172.16.1.23 camille.local
     4        5.9gb     5.5gb      690gb    695.6gb            0 172.16.1.24 172.16.1.24 carol.local
     3        4.5gb     4.2gb      3.2tb      3.2tb            0 172.16.1.22 172.16.1.22 bahamas.local

So this is normal, so to speak. Shard placement is balanced automatically across the nodes. I believe tweaking the shard count manually has more to do with performance than with resilience? A way to improve resilience would be to add more nodes, like 5 nodes would allow up to 2 nodes to be lost.
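In case it helps anyone else: the shard and replica counts are per-index settings. The replica count can be changed at any time (as above), while the primary shard count is fixed when the index is created. A hypothetical index creation, assuming a node on localhost:9200 and an index name of new-index:

curl -X PUT "localhost:9200/new-index?pretty" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  }
}
'

With 1 replica the data survives losing one node; replicas (plus enough nodes to hold them) are what add data redundancy.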
