Why are some shards unassigned even though there are empty nodes?


#1

Hello

Looking at my Elasticsearch cluster (2.4.2) today, I found that there are unallocated shards. There seems to be a pattern:

It looks like the shards which are on eu2 are not allocated anywhere else (neither on eu3 nor on eu4).

What can be the reason for that?

I did not fine-tune any allocation strategies, the machines are simply in a cluster with default settings and I do not see any error or warning in the logs (/var/log/elasticsearch/<cluster name>.log). While I did not look at the status for some time, it used to be green (with the same configuration).

I tried to look at the reason for the lack of assignment:

[root:~]# curl "http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason"| grep UNASSIGNED
veaees1.html                 1 r UNASSIGNED NODE_LEFT
veaees1.html                 0 r UNASSIGNED NODE_LEFT
eangvn1.html                 1 r UNASSIGNED NODE_LEFT
eangvn1.html                 0 r UNASSIGNED NODE_LEFT
wnoaog1.html                 0 r UNASSIGNED NODE_LEFT
logstash-2015.01.28          0 r UNASSIGNED NODE_LEFT
logstash-2015.01.29          0 r UNASSIGNED NODE_LEFT
spsnwr1.html                 1 r UNASSIGNED NODE_LEFT
spsnwr1.html                 0 r UNASSIGNED NODE_LEFT
flex2gateway                 0 r UNASSIGNED NODE_LEFT
logstash-2015.01.30          1 r UNASSIGNED NODE_LEFT
pnopsg1.html                 0 r UNASSIGNED NODE_LEFT
nessus-candidates-2016-12-05 2 r UNASSIGNED INDEX_CREATED
(similar lines repeated - cut off due to lack of space for the question)
nessus-candidates-2016-12-07 0 r UNASSIGNED INDEX_CREATED
index                        0 r UNASSIGNED NODE_LEFT
rwsnsr1.html                 0 r UNASSIGNED NODE_LEFT
nessus-candidates-2016-12-01 2 r UNASSIGNED INDEX_CREATED
(...)
nessus-candidates-2016-12-04 2 r UNASSIGNED INDEX_CREATED
nessus_logs                  0 r UNASSIGNED NODE_LEFT
spipe                        0 r UNASSIGNED NODE_LEFT
phppath                      0 r UNASSIGNED NODE_LEFT
ogrnge1.html                 0 r UNASSIGNED NODE_LEFT
nessus                       1 r UNASSIGNED NODE_LEFT
webui                        0 r UNASSIGNED NODE_LEFT
rossae1.html                 1 r UNASSIGNED NODE_LEFT
rossae1.html                 0 r UNASSIGNED NODE_LEFT
nvseeo1.html                 0 r UNASSIGNED NODE_LEFT
perl                         0 r UNASSIGNED NODE_LEFT
dummy_index                  1 r UNASSIGNED NODE_LEFT
dummy_index                  0 r UNASSIGNED NODE_LEFT
enarow1.html                 0 r UNASSIGNED NODE_LEFT
wvarrr1.html                 2 r UNASSIGNED NODE_LEFT
wvarrr1.html                 0 r UNASSIGNED NODE_LEFT
gawvoo1.html                 2 r UNASSIGNED NODE_LEFT
gawvoo1.html                 0 r UNASSIGNED NODE_LEFT
logs                         1 r UNASSIGNED NODE_LEFT
logs                         0 r UNASSIGNED NODE_LEFT
gnopgn1.html                 1 r UNASSIGNED NODE_LEFT
(...)

I had to restart eu4 for unrelated reasons. Before the reboot there were 15 unassigned shards; now there are 62. All the shards that were unassigned before the reboot had the INDEX_CREATED reason. The shards that became unassigned after the reboot have the NODE_LEFT reason.
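
The split between reasons can be tallied directly from the `_cat/shards` output. Below is a minimal sketch, assuming the same column order as in the command above (index, shard, prirep, state, unassigned.reason); `count_unassigned_reasons` is a hypothetical helper name, not part of Elasticsearch:

```shell
# Tally UNASSIGNED shards by reason.
# Assumes whitespace-separated input columns:
#   index shard prirep state unassigned.reason
# Lines whose 4th column is not UNASSIGNED are ignored.
count_unassigned_reasons() {
  awk '$4 == "UNASSIGNED" { count[$5]++ }
       END { for (r in count) print r, count[r] }' | sort
}
```

It could be used as `curl -s "http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | count_unassigned_reasons`, which prints one line per reason followed by its shard count.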

EDIT: I added a new node to the cluster to check whether this would help. It is visible in the cluster but does not get any shards allocated (the unassigned shards remain unassigned and the new node is empty).


(Mark Walkom) #2

Did you disable allocation?

Check _cluster/settings


(Christian Dahlqvist) #3

Are all nodes using the same Elasticsearch version?
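
This matters because, as far as I know, Elasticsearch does not allocate a replica to a node running an older version than the node holding the primary, so even a small version mismatch can leave shards unassigned. One way to check is to list the version each node reports. A minimal sketch, assuming the output of `_cat/nodes?h=name,version` (version in the last column); `count_node_versions` is a hypothetical helper name:

```shell
# Print each distinct Elasticsearch version and how many nodes report it.
# Assumes input lines of the form: <node name> <version>
# The version is read from the last field, so node names may contain spaces.
count_node_versions() {
  awk 'NF >= 2 { seen[$NF]++ }
       END { for (v in seen) print v, seen[v] }' | sort
}
```

It could be used as `curl -s "http://localhost:9200/_cat/nodes?h=name,version" | count_node_versions`. More than one line of output means the cluster is running mixed versions.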


(Mark Walkom) #4

And do they have enough free disk space?
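
Disk usage per node is visible through `_cat/allocation`. Here is a rough sketch for flagging nodes that are at or above a usage threshold, assuming the output of `_cat/allocation?h=disk.percent,node` and the default low disk watermark of 85%; `nodes_over_watermark` is a hypothetical helper name, and the threshold should be adjusted if the watermark settings were changed:

```shell
# List nodes whose disk usage is at or above a threshold in percent.
# The threshold defaults to 85 (the default low disk watermark).
# Assumes input lines of the form: <disk.percent> <node name>
# Rows without a numeric first column (e.g. the UNASSIGNED row) are skipped.
nodes_over_watermark() {
  awk -v limit="${1:-85}" \
    '$1 ~ /^[0-9]+$/ && $1 + 0 >= limit + 0 { print }'
}
```

It could be used as `curl -s "http://localhost:9200/_cat/allocation?h=disk.percent,node" | nodes_over_watermark`. No output means no node has reached the threshold.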


#5

@Christian_Dahlqvist @warkolm

EDIT: My apologies. The nodes do run the same major and minor version, but the patch version was different (2.4.1 vs 2.4.2). After upgrading, everything is now fine.


The allocation is not disabled:

{
   "persistent": {},
   "transient": {
      "cluster": {
         "routing": {
            "allocation": {
               "enable": "true",
               "disable_allocation": "false"
            }
         }
      }
   }
}  

All the nodes run the same version of Elasticsearch and all have ample disk space.

