Unassigned shards: what to do about them?

Hello Guys,

First off, our cluster:
Master (Kibana)
- 8 GB RAM
- Apache web server
- Does not filter/query

Worker node #01/#02
- 32 GB RAM each
- No Kibana

So my cluster is running mostly fine, but the health under "Monitoring" is yellow, since around 50% of my shards are unassigned. What do I have to do about it? My predecessor set the shard defaults to:
- 5 primary shards
- 1 replica
If I understand this right, that is way too much and I should go with 2 primaries and 1 replica, but I'm struggling to find where I can change this.

Any other ideas why there are so many unassigned shards?

Greetz
Mo

That should work even with 5 shards and 1 replica.
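That said, if you do want to change the shard count for indices created in the future, in 6.x that is done with an index template. A rough sketch (the template name and index pattern here are just examples; adjust them to your own naming, and note this only affects newly created indices, not existing ones):

```
PUT _template/fewer-shards
{
  "index_patterns": ["filebeat-*"],
  "order": 1,
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}
```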

It could be something like node.attr settings, e.g. rack awareness. Do you see any node.attr options in elasticsearch.yml? If so, what are the settings on both nodes?

Pick one index with unassigned shards and make sure it doesn't have replicas set to 2 or more.
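To check that, and to reduce the replica count if needed, something like this should work in Dev Tools (the index name is just a placeholder, use one of your own indices):

```
GET /my-index/_settings

PUT /my-index/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
```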

Add custom attributes to the node:

#node.attr.rack: r1

----------------------------------- Paths ------------------------------------

Well, I guess it's left at the default? It's like this on both nodes.

Hi,

Did you read the logs? When a shard can't be assigned, it's possible that something happened to those shards or to the cluster at allocation time.

Also, you can see which shards haven't been assigned with the commands below:

CLI: curl -XGET "<IP>:9200/_cat/shards" | grep "UNASSIGNED"
DEV Tools: GET /_cat/shards

There are many reasons why shards might be unassigned, and it is rare for the logs to contain much useful information. The right way to diagnose the reason for an unassigned shard is to use the allocation explain API.
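For example, in Dev Tools (called with no body, it reports on the first unassigned shard it finds; you can also ask about a specific shard, here using a placeholder index name):

```
GET /_cluster/allocation/explain

GET /_cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": false
}
```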

@Silen_logs @DavidTurner First of all, thanks both of you for the hints & tips.

After using the allocation explain API, I got back various reasons why the shards cannot be assigned:

the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [13.226582219832771%] --> I've added 250 GB more disk and set the watermark rule to 80%.

the shard cannot be allocated to the same node on which a copy of the shard already exists --> I don't really know what to do about that. Any clues?

cannot allocate because allocation is not permitted to any of the nodes --> I guess I'll find this in some .yml/conf file to change?

Thanks for your help guys, appreciate it!

Greetz
Moritz

Are all your Elasticsearch nodes running exactly the same version?

It's quite hard to help from just the few small messages that you've picked out by hand. Please share the whole output.

That's normal. Elasticsearch won't put more than one copy of a shard on each node.

Ok, add more disk space, delete some data, or adjust the watermark settings. NB cluster.routing.allocation.disk.watermark.low=85% means the limit is when the disk is 85% full, so reducing the watermark to 80% is making the problem worse.
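If you do decide to adjust the watermarks (i.e. raise them, not lower them), it would look roughly like this; note that a transient setting is lost on a full cluster restart, and freeing disk space is the better long-term fix:

```
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%"
  }
}
```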

This is a summary of the more detailed information elsewhere in the output.

Ok, here is the whole output..

{
  "index" : "filebeat-6.8.1-2019.07.30",
  "shard" : 1,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2019-07-30T11:37:45.122Z",
    "details" : "node_left [LZyAKlAfS-mNxvdjuaEwUg]",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "LZyAKlAfS-mNxvdjuaEwUg",
      "node_name" : "LZyAKlA",
      "transport_address" : "X.X.X.223:9300",
      "node_attributes" : {
        "ml.machine_memory" : "25111109632",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true",
        "ml.enabled" : "true"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[filebeat-6.8.1-2019.07.30][1], node[LZyAKlAfS-mNxvdjuaEwUg], [P], s[STARTED], a[id=bW1xHKZcRPOlDzLLDDEKXA]]"
        },
        {
          "decider" : "disk_threshold",
          "decision" : "NO",
          "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [14.997178982365764%]"
        }
      ]
    },
    {
      "node_id" : "LdEwqP_YRiWJRWA1_UyIgg",
      "node_name" : "LdEwqP_",
      "transport_address" : "X.X.X.222:9300",
      "node_attributes" : {
        "ml.machine_memory" : "25111093248",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true",
        "ml.enabled" : "true"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "disk_threshold",
          "decision" : "NO",
          "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [13.211407213469084%]"
        }
      ]
    }
  ]
}

Yes, both nodes are on version 6.8.1:
Node#01
Installed Packages
Name : elasticsearch
Arch : noarch
Version : 6.8.1
Release : 1
Size : 227 M
Repo : installed
From repo : elastic-6.x
Node#02
Installed Packages
Name : elasticsearch
Arch : noarch
Version : 6.8.1
Release : 1
Size : 227 M
Repo : installed
From repo : elastic-6.x

Thanks, that helps. I reformatted it for you using the </> button to make it easier to read.

You have two data nodes and their disks are both over 85% full, so no replicas can be allocated in this cluster.
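You can see the per-node disk usage at a glance with this request; the `disk.percent` column shows how full each data node's disk is:

```
GET /_cat/allocation?v
```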

Sorry about that! Will use </> in the future.
Thanks a lot, I'll upgrade the disks and hope this problem clears up.
Will mark your answer properly by tomorrow!
Thanks again.

Little update: I noticed that ES had trouble allocating replica shards because it thought that exactly the same replica was already on that node. Reassigned the replicas (a long process) and right now I'm at 200 unassigned shards, and it's going down with every index checked. I don't know yet if this is the final solution, since I haven't added the disk space yet.
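In case anyone else hits this: Elasticsearch stops retrying a shard after a few failed allocation attempts, so after fixing the underlying cause (disk space, in my case) you may need to ask it to retry explicitly, roughly like this:

```
POST /_cluster/reroute?retry_failed=true
```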