I had a 4-node cluster with 2 indices (5 and 10 shards respectively), each with 2 replicas.
I removed one of the nodes from the cluster by shutting it down, so it is gone for good. I did this by first calling _cluster/nodes//shutdown and then killing the elasticsearch process.
But now I see UNASSIGNED shards sitting around that are not being distributed to the remaining 3 nodes. enable_allocation is true.
Is there something wrong in how I remove the node from the cluster?
$ curl "localhost:25700/_cat/nodes?v"
host ip heap.percent ram.percent load node.role master name
10.2.34.181 10.2.34.181 17 99 15.48 d m 130593666698
10.2.34.185 10.2.34.185 19 99 19.44 d m 130593668838
10.2.34.183 10.2.34.183 14 99 12.65 d * 130593666690
I don't see any pending tasks - I'd assume ES would be moving some shards, but it's not.
Can you attach the full cluster state here (or put it on pastebin)? /_cluster/state?pretty
If it contains confidential information, please message me a private link.
I do have the shard state in my terminal. [The _cluster/state output was unfortunately piped to less.]
I also realised that _shutdown has been removed, so the node removal was effectively done by stopping the ES process on 10.2.34.179.
$ curl 'localhost:25700/_cat/shards?v'
index shard prirep state docs store ip node
cfileindex 9 p STARTED 886853 106.4mb 10.2.34.183 130593666690
cfileindex 9 r STARTED 886853 106.4mb 10.2.34.185 130593668838
cfileindex 9 r STARTED 886853 106.4mb 10.2.34.181 130593666698
cfileindex 2 p STARTED 886407 106.2mb 10.2.34.185 130593668838
cfileindex 2 r STARTED 886407 106.2mb 10.2.34.181 130593666698
cfileindex 2 r UNASSIGNED
cfileindex 5 p STARTED 886642 106.7mb 10.2.34.185 130593668838
cfileindex 5 r STARTED 886642 106.7mb 10.2.34.181 130593666698
cfileindex 5 r UNASSIGNED
cfileindex 8 p STARTED 885850 106.6mb 10.2.34.183 130593666690
cfileindex 8 r STARTED 885850 106.6mb 10.2.34.185 130593668838
cfileindex 8 r STARTED 885850 106.6mb 10.2.34.181 130593666698
cfileindex 7 p STARTED 885387 106.5mb 10.2.34.183 130593666690
cfileindex 7 r STARTED 885387 106.5mb 10.2.34.185 130593668838
cfileindex 7 r STARTED 885387 106.5mb 10.2.34.181 130593666698
cfileindex 6 p STARTED 886446 106.3mb 10.2.34.183 130593666690
cfileindex 6 r STARTED 886446 106.3mb 10.2.34.185 130593668838
cfileindex 6 r UNASSIGNED
cfileindex 1 p STARTED 886537 106.3mb 10.2.34.183 130593666690
cfileindex 1 r STARTED 886537 106.3mb 10.2.34.181 130593666698
cfileindex 1 r UNASSIGNED
cfileindex 3 p STARTED 884795 106.3mb 10.2.34.183 130593666690
cfileindex 3 r STARTED 884795 106.3mb 10.2.34.185 130593668838
cfileindex 3 r UNASSIGNED
cfileindex 4 p STARTED 885088 106.2mb 10.2.34.183 130593666690
cfileindex 4 r STARTED 885088 106.2mb 10.2.34.181 130593666698
cfileindex 4 r UNASSIGNED
cfileindex 0 p STARTED 885674 106.3mb 10.2.34.183 130593666690
cfileindex 0 r STARTED 885674 106.3mb 10.2.34.185 130593668838
cfileindex 0 r UNASSIGNED
objindex 4 p STARTED 6 55.7kb 10.2.34.183 130593666690
objindex 4 r STARTED 6 55.7kb 10.2.34.185 130593668838
objindex 4 r STARTED 6 55.7kb 10.2.34.181 130593666698
objindex 3 p STARTED 10 111.9kb 10.2.34.183 130593666690
objindex 3 r STARTED 10 111.9kb 10.2.34.181 130593666698
objindex 3 r UNASSIGNED
objindex 1 p STARTED 3 52kb 10.2.34.183 130593666690
objindex 1 r STARTED 3 52kb 10.2.34.185 130593668838
objindex 1 r UNASSIGNED
objindex 2 p STARTED 9 134.2kb 10.2.34.183 130593666690
objindex 2 r STARTED 9 134.2kb 10.2.34.181 130593666698
objindex 2 r UNASSIGNED
objindex 0 p STARTED 8 59.5kb 10.2.34.185 130593668838
objindex 0 r STARTED 8 59.5kb 10.2.34.181 130593666698
objindex 0 r UNASSIGNED
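For anyone reading a listing like this, a small script can pair each unassigned replica with the nodes that do not yet hold a copy of that shard. This is only a rough sketch: the function name candidate_nodes and the SAMPLE excerpt are made up for illustration, and it assumes the whitespace-separated _cat/shards layout shown above with single-token node names.

```python
from collections import defaultdict

# Excerpt of the `_cat/shards?v` output from this thread (illustrative sample).
SAMPLE = """\
index shard prirep state docs store ip node
cfileindex 9 p STARTED 886853 106.4mb 10.2.34.183 130593666690
cfileindex 9 r STARTED 886853 106.4mb 10.2.34.185 130593668838
cfileindex 9 r STARTED 886853 106.4mb 10.2.34.181 130593666698
cfileindex 2 p STARTED 886407 106.2mb 10.2.34.185 130593668838
cfileindex 2 r STARTED 886407 106.2mb 10.2.34.181 130593666698
cfileindex 2 r UNASSIGNED
"""


def candidate_nodes(cat_shards_text):
    """Map each (index, shard) with an unassigned copy to nodes lacking that shard."""
    holders = defaultdict(set)   # (index, shard) -> nodes that hold a copy
    missing = defaultdict(int)   # (index, shard) -> number of unassigned copies
    all_nodes = set()

    # Skip the header row, then split each data row on whitespace.
    for line in cat_shards_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        index, shard, _prirep, state = fields[:4]
        key = (index, int(shard))
        if state == "UNASSIGNED":
            # Unassigned rows carry no docs/store/ip/node columns.
            missing[key] += 1
        elif len(fields) >= 8:
            node = fields[7]  # assumes the node name is a single token
            holders[key].add(node)
            all_nodes.add(node)

    return {key: sorted(all_nodes - holders[key]) for key in missing}


if __name__ == "__main__":
    for (index, shard), nodes in candidate_nodes(SAMPLE).items():
        print(index, shard, "could go to:", ", ".join(nodes) or "(no free node)")
```

If every unassigned replica has at least one candidate node, as appears to be the case in the listing above, then a lack of free nodes is probably not what is keeping the shards unassigned.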
I'll update the thread in case this reproduces - please let me know if there's anything more I can capture at this stage. It looks like a bug to me that shards were not being assigned.
Once it reproduces, capture the cluster state and try the reroute command with "explain" on an unassigned shard, allocating it to a node that does not already hold a copy of that shard.
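As a rough sketch of that reroute suggestion, the snippet below only builds the request and prints it, so nothing is sent to the cluster. Several things here are assumptions: the helper name build_reroute_explain is made up, the "allocate" command name is the one used by older Elasticsearch releases (newer releases use "allocate_replica" instead), and the shard and node values are just one unassigned replica and one node without that shard taken from the listing earlier in the thread. dry_run is switched on so that the cluster would only explain the decision rather than apply it.

```python
import json
import urllib.request


def build_reroute_explain(base_url, index, shard, node, dry_run=True):
    """Build (but do not send) a reroute request that asks for an explanation."""
    query = "explain=true" + ("&dry_run=true" if dry_run else "")
    # Assumption: "allocate" is the command name on older releases;
    # newer releases call the replica variant "allocate_replica".
    body = {"commands": [{"allocate": {"index": index, "shard": shard, "node": node}}]}
    return urllib.request.Request(
        f"{base_url}/_cluster/reroute?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example values: cfileindex shard 2 has an unassigned replica, and node
# 130593666690 holds no copy of it according to the _cat/shards output.
req = build_reroute_explain("http://localhost:25700", "cfileindex", 2, "130593666690")

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req).read()
```

The "explanations" section of the response should list which allocation deciders accepted or rejected the move, which is the information worth posting back to the thread.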