Unable to decommission nodes from cluster


#1

Hi all,

I was running a cluster with 3 nodes and decided to replace them because they were underutilised.
So I added 3 fresh nodes with lower specs (especially disk space).

After the cluster stabilised, I fired the decommission API for a node, but it didn't do anything.

$ curl -X PUT http://master.node:9200/_cluster/settings -d '{
  "transient": {
    "cluster.routing.allocation.exclude._ip": "a.b.c.d"
  }
}'
The node is unaffected; its shards are not being relocated. I cross-checked with the GET API, and it shows:
$ curl -X GET http://master.node:9200/_cluster/settings
Output
{
    "persistent": {
        "cluster": {
            "routing": {
                "allocation": {
                    "enable": "all",
                    "disable_allocation": "false"
                }
            }
        }
    },
    "transient": {
        "cluster": {
            "routing": {
                "allocation": {
                    "exclude": {
                        "_ip": "a.b.c.d"
                    }
                }
            }
        }
    }
}
What am I missing here? Why is this node not getting decommissioned?

(Mark Walkom) #2

Can you try using hostnames?


#3

Tried using hostname, and it works! :smiley:
Thanks a bunch!
Can you please explain why it wasn't working with IP?
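
For reference, a minimal sketch of what the hostname-based variant of the request might look like. The thread does not show the exact command that worked, so the `_host` attribute and the `node-hostname` value are assumptions for illustration; only the `master.node:9200` address comes from the original post. The helper name `build_exclude_request` is hypothetical.

```python
import json
import urllib.request


def build_exclude_request(base_url, attribute, value):
    """Build (but do not send) a PUT to _cluster/settings that excludes
    nodes matching `value` on the given attribute (e.g. "_ip", "_host")."""
    body = {
        "transient": {
            # Same setting family as in the original post, keyed by attribute.
            "cluster.routing.allocation.exclude." + attribute: value
        }
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# "node-hostname" is a placeholder, not a value taken from this thread.
request = build_exclude_request("http://master.node:9200", "_host", "node-hostname")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually apply it against a live cluster:
# urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the body before it touches the cluster.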


(Mark Walkom) #4

Maybe ES wasn't bound to that IP?


#5

Just checked: the node info shows "ip" as "127.0.0.1".

{
  "cluster_name": "cluster.name",
  "nodes": {
    "node_id": {
      "name": "node.name",
      "transport_address": "inet[/10.0.0.xx:9300]",
      "host": "localhost",
      "ip": "127.0.0.1",
      "version": "1.4.0",
      "build": "bc94bd8",
      "http_address": "inet[/10.0.0.xx:9200]",
      "settings": {
        "index": { ....
  ....
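
A rough sketch of the check this implies, assuming the `_ip` exclude value is compared against the `ip` field that node info reports. That is a reading of this thread rather than something confirmed here, and the real matching inside Elasticsearch may differ by version. The function name `nodes_reporting_ip` is hypothetical, `fnmatch` only approximates wildcard matching, and the sample data mirrors the output above with the elided parts left out.

```python
import fnmatch


def nodes_reporting_ip(nodes_info, pattern):
    """Return the ids of nodes whose reported "ip" field matches `pattern`.

    `nodes_info` is the parsed JSON of a node info response; `pattern`
    may contain "*" wildcards (approximated here with fnmatch).
    """
    matched = []
    for node_id, node in nodes_info.get("nodes", {}).items():
        if fnmatch.fnmatchcase(node.get("ip", ""), pattern):
            matched.append(node_id)
    return matched


# Sample shaped like the node info in this post (placeholders kept as posted).
sample = {
    "cluster_name": "cluster.name",
    "nodes": {
        "node_id": {
            "name": "node.name",
            "transport_address": "inet[/10.0.0.xx:9300]",
            "host": "localhost",
            "ip": "127.0.0.1",
        }
    },
}

# A filter on the 10.0.0.x address finds nothing, because the node reports loopback.
print(nodes_reporting_ip(sample, "10.0.0.*"))
print(nodes_reporting_ip(sample, "127.0.0.1"))
```

If the first call comes back empty for the address used in the exclude setting, that would be consistent with the filter silently matching no node.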
