Added a data node, replica shards from .scripts index unassigned, cluster won't rebalance


I added a new data node to our cluster (all nodes running ES 2.4.1). The node joined successfully, but the cluster state is now yellow because 5 replica shards of the .scripts index are unassigned (they should be on this new data node).

GET .scripts/_settings shows this:

{
  ".scripts": {
    "settings": {
      "index": {
        "number_of_shards": "5",
        "auto_expand_replicas": "0-all",
        "creation_date": "1461180578329",
        "unassigned": {
          "node_left": {
            "delayed_timeout": "10m"
          }
        },
        "number_of_replicas": "10",
        "uuid": "lclh6JI_QsGUJxoYfL2N6g",
        "version": {
          "created": "2030199"
        }
      }
    }
  }
}

Looking at _cat/shards, I see one of these for each of the 5 replica shards that are unassigned:


And kopf shows the new node not getting any shards replicated to it (highlighted in red here):

The logs from the master server from when the new data node joined are:

[2017-02-03 12:02:30,917][INFO ][cluster.service          ] [] added {{}{XXXXXXXXX}{XXX.XXX.XXX.XXX}{XXX.XXX.XXX.XXX:9300}{max_local_storage_nodes=1, aws_availability_zone=XXXXXX, tag=current, master=false},}, reason: zen-disco-join(join from node[{}{XXXXXXXXXXX}{XXX.XXX.XXX.XXX}{XXX.XXX.XXX.XXX:9300}{max_local_storage_nodes=1, aws_availability_zone=XXXXXXXX, tag=current, master=false}])
[2017-02-03 12:03:00,927][WARN ][discovery.zen.publish    ] [] timed out waiting for all nodes to process published state [737391] (timeout [30s], pending nodes: [{}{XXXXXXXXX}{XXX.XXX.XXX.XXX}{XXX-XXX.XXX.XXX:9300}{max_local_storage_nodes=1, aws_availability_zone=XXXXXX, tag=current, master=false}])
[2017-02-03 12:03:00,934][WARN ][cluster.service          ] [] cluster state update task [zen-disco-join(join from node[{}{XXXXXXXX}{XXX.XXX.XXX.XXX}{XXX.XXX.XXX.XXX:9300}{max_local_storage_nodes=1, aws_availability_zone=XXXXXXX, tag=current, master=false}])] took 30s above the warn threshold of 30s
[2017-02-03 12:03:00,935][INFO ][cluster.metadata         ] [] updating number_of_replicas to [10] for indices [.scripts]
[2017-02-03 12:03:00,952][INFO ][cluster.metadata         ] [] [.scripts] auto expanded replicas to [10]
[2017-02-03 12:03:30,953][WARN ][discovery.zen.publish    ] [] timed out waiting for all nodes to process published state [737392] (timeout [30s], pending nodes: [{}{XXXXXXXX}{XXX.XXX.XXX.XXX}{XXX.XXX.XXX.XXX:9300}{max_local_storage_nodes=1, aws_availability_zone=XXXXX, tag=current, master=false}])
[2017-02-03 12:03:31,060][WARN ][cluster.service          ] [] cluster state update task [update-settings] took 30.1s above the warn threshold of 30s

I tried restarting the new data node and closing and re-opening the .scripts index; neither worked.

How do I get out of this state?

Thanks in advance


Possibly related?

When I try to manually re-route one of the shards in question, I get this error:

POST _cluster/reroute
{
    "commands" : [
        {
          "allocate" : {
              "index" : ".scripts", "shard" : 0, "node" : "XXXXXXXXXXX"
          }
        }
    ]
}

{
   "error": {
      "root_cause": [
         {
            "type": "remote_transport_exception",
            "reason": "[][XXX.XXX.XXX.XXX:9300][cluster:admin/reroute]"
         }
      ],
      "type": "illegal_argument_exception",
      "reason": "[allocate] allocation of [.scripts][0] on node {}{XXXXXXXXXXX}{XXX.XXX.XXX.XXX}{XXX.XXX.XXX.XXX:9300}{max_local_storage_nodes=1, aws_availability_zone=XXXX, tag=current, master=false} is not allowed, reason: [YES(below shard recovery limit of [5])][YES(allocation disabling is ignored)][YES(shard not primary or relocation disabled)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)]

***** THIS *****
[NO(too many shards on node for attribute: [aws_availability_zone], required per attribute: [2], node count: [4], leftover: [1])]
***** THIS *****

[YES(allocation disabling is ignored)][YES(enough disk for shard on node, free: [956.5gb])][YES(target node version [2.4.1] is same or newer than source node version [2.4.1])][YES(node passes include/exclude/require filters)][YES(primary is already active)][YES(shard is not allocated to same node or host)]"
   },
   "status": 400
}
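
The reason string packs every allocation decision into one line, so the single NO is easy to miss. As a rough aid, the sketch below pulls out only the NO decisions. `blocking_decisions` is a hypothetical helper name, and the pattern assumes the `[YES(...)]` / `[NO(...)]` layout seen in the error above; it is not an Elasticsearch API.

```python
import re

# Assumed layout: each decision looks like [VERDICT(message)], back to back.
DECISION = re.compile(r"\[(YES|NO|THROTTLE)\((.*?)\)\]")

def blocking_decisions(reason):
    """Return the message of every NO decision found in a reroute reason string."""
    return [m.group(2) for m in DECISION.finditer(reason) if m.group(1) == "NO"]

# Shortened copy of the reason string from the error above.
sample = (
    "[YES(below shard recovery limit of [5])]"
    "[NO(too many shards on node for attribute: [aws_availability_zone], "
    "required per attribute: [2], node count: [4], leftover: [1])]"
    "[YES(primary is already active)]"
)

for message in blocking_decisions(sample):
    print(message)
```

On the shortened sample this prints only the awareness message, which is the decision that blocks the allocation.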

Moving this new data node to another AZ where there was only one other data node made this problem go away. I don't really understand how "auto_expand_replicas": "0-all" is supposed to work with cluster.routing.allocation.awareness.attributes: aws_availability_zone set.
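
For anyone hitting the same thing, the numbers in the NO decision can be reproduced with a little arithmetic. This is a minimal sketch of how I understand the awareness check in 2.x, not the actual decider code: `awareness_allows` is a hypothetical name, and the zone count of 4 is an assumption (the thread never states how many AZs the cluster spans). With 10 replicas each shard has 11 copies; dividing by the number of zones gives the "required per attribute" value, and any remainder allows one extra copy per zone.

```python
def awareness_allows(total_copies, zone_count, copies_in_zone_after):
    """Return True if the target zone may hold this many copies of one shard.

    total_copies          primary plus replicas for a single shard
    zone_count            distinct values of the awareness attribute (assumed)
    copies_in_zone_after  copies the target zone would hold, counting the new one
    """
    required_per_zone = total_copies // zone_count
    # Any remainder lets each zone take at most one extra copy.
    leftover = 0 if total_copies % zone_count == 0 else 1
    return copies_in_zone_after <= required_per_zone + leftover

# 10 replicas + 1 primary = 11 copies; 4 zones is an assumption.
# This gives required per attribute 2 and leftover 1, as in the error message.
print(awareness_allows(11, 4, 4))  # 4 copies in one zone, limit is 3 -> False
print(awareness_allows(11, 4, 2))  # 2 copies in one zone, limit is 3 -> True
```

Under these assumptions, auto_expand_replicas raises the replica count to match the new node, but a zone that already holds 3 copies of a shard cannot take a 4th, so the extra replica stays unassigned until the node sits in a zone with room.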
