Newly created index has unallocated shards


(Peter) #1

Hello,

I have a 3-node Elasticsearch cluster. Everything was OK until I created a new index; the newly created index has unassigned replica shards:

core@client-cluster-1:~$ curl http://localhost:19268/_cat/shards?v
index shard prirep state docs store ip node
site-id 3 p STARTED 0 159b 10.0.0.6 config.rets.ci-client-cluster-2
site-id 3 r STARTED 0 159b 10.0.0.7 config.rets.ci-client-cluster-3
site-id 4 p STARTED 0 159b 10.0.0.5 config.rets.ci-client-cluster-1
site-id 4 r UNASSIGNED
site-id 2 r STARTED 0 159b 10.0.0.5 config.rets.ci-client-cluster-1
site-id 2 p STARTED 0 159b 10.0.0.7 config.rets.ci-client-cluster-3
site-id 1 p STARTED 0 159b 10.0.0.5 config.rets.ci-client-cluster-1
site-id 1 r UNASSIGNED
site-id 0 p STARTED 0 159b 10.0.0.6 config.rets.ci-client-cluster-2
site-id 0 r STARTED 0 159b 10.0.0.7 config.rets.ci-client-cluster-3

Please help me get the cluster back to a green state.

Regards,
Peter


(Mark Walkom) #2

Do you have enough disk space on all the nodes?
Check the master logs, it might say something.


(Peter) #3

More than enough space on all nodes. And nothing in the logs.


(Mark Walkom) #4

Well, a quick fix would be to remove the replicas and re-add them.

However you can also try a reroute on those two shards to see if it works, and if not, why they aren't being allocated.


(Peter) #5

Would you mind telling me how to do either of those steps?


(Mark Walkom) #6

https://www.elastic.co/guide/en/elasticsearch/guide/2.x/replica-shards.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.1/cluster-reroute.html
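For reference, here is a minimal sketch of the first option (dropping the replicas and adding them back) using the update index settings API. The `ES_URL` variable and the helper function names are assumptions made for illustration; Peter's cluster listens on port 19268 rather than the default 9200.

```shell
#!/usr/bin/env bash
# Assumption: ES_URL points at the HTTP port of any node in the cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the settings body that sets the replica count to $1.
replica_settings_body() {
  printf '{"index":{"number_of_replicas":%d}}' "$1"
}

# Apply replica count $2 to index $1 through the update index settings API.
set_replicas() {
  curl -s -XPUT "$ES_URL/$1/_settings" -d "$(replica_settings_body "$2")"
}

# Usage (not run automatically):
#   set_replicas site-id 0   # drop the replicas
#   set_replicas site-id 1   # ask for one replica per shard again
```

If the underlying cause is still present, the re-added replicas will most likely return to UNASSIGNED, so this is a quick thing to try rather than a diagnosis.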


(Peter) #7

Thanks,

I have tried removing the replica and then adding it back, and it goes straight back to unassigned.
If I try rerouting the shard, the documentation shows a from_node, but since the shard is unassigned there is no from_node.


(Yannick Welsch) #8

Which ES version is this? If >= v5.0.0, have a look at the cluster allocation explain API
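On 5.0 or later, a call to the allocation explain API could look roughly like the sketch below. The `ES_URL` variable and the function names are assumptions for illustration, and the index and shard in the usage line are simply the ones from the first post.

```shell
#!/usr/bin/env bash
# Assumption: ES_URL points at the HTTP port of any node in the cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the request body: index name $1, shard number $2, primary flag $3 (true/false).
explain_body() {
  printf '{"index":"%s","shard":%d,"primary":%s}' "$1" "$2" "$3"
}

# Ask the cluster why the given shard copy is (not) allocated. Requires ES >= 5.0.
explain_shard() {
  curl -s -XGET "$ES_URL/_cluster/allocation/explain?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(explain_body "$1" "$2" "$3")"
}

# Usage (not run automatically):
#   explain_shard site-id 1 false   # explain the unassigned replica of shard 1
```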


(Peter) #9

The Elasticsearch version is 2.3.4.


(Yannick Welsch) #10

What's the output of running

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
    "commands" : [ {
        {
          "allocate_replica" : {
              "index" : "site-id", "shard" : 1, "node" : "config.rets.ci-client-cluster-2"
          }
        }
    ]
}'

What if you run the same command for config.rets.ci-client-cluster-3?


(Peter) #11

core@client-cluster-1:~$ curl -XPOST 'localhost:19268/_cluster/reroute' -d '{
"commands" : [ {
    {
      "allocate_replica" : {
          "index" : "site-id", "shard" : 1, "node" : "config.rets.ci-client-cluster-2"
      }
    }
]

}'
{"error":{"root_cause":[{"type":"json_parse_exception","reason":"Unexpected character ('{' (code 123)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name\n at [Source: org.elasticsearch.transport.netty.ChannelBufferStreamInput@469fd885; line: 3, column: 10]"}],"type":"json_parse_exception","reason":"Unexpected character ('{' (code 123)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name\n at [Source: org.elasticsearch.transport.netty.ChannelBufferStreamInput@469fd885; line: 3, column: 10]"},"status":500}


(Yannick Welsch) #12

Yeah, that had some typos, and on 2.x the command is allocate rather than allocate_replica. The correct one is:

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
    "commands" : [
        {
          "allocate" : {
              "index" : "site-id", "shard" : 1, "node" : "config.rets.ci-client-cluster-2"
          }
        }
    ]
}'


(Peter) #13

I got a different error:
{"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[config.rets.ci-client-cluster-2][10.0.0.6:19368][cluster:admin/reroute]"}],"type":"illegal_argument_exception","reason":"[allocate] allocation of [site-id][1] on node {config.rets.ci-client-cluster-2}{JWGyMaFESxWE-xsy1baMhQ}{10.0.0.6}{10.0.0.6:19368} is not allowed, reason: [YES(allocation disabling is ignored)][YES(allocation disabling is ignored)][YES(shard is not allocated to same node or host)][YES(shard not primary or relocation disabled)][YES(no allocation awareness enabled)][YES(enough disk for shard on node, free: [386.3gb])][YES(primary is already active)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(node passes include/exclude/require filters)][NO(target node version [2.3.3] is older than source node version [2.3.4])][YES(below shard recovery limit of [2])]"},"status":400}
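The reason string in that response lists one decision per allocation decider, and only the NO entries block the allocation. A small sketch for pulling them out of a saved response follows; the function name is made up for illustration, and the pattern assumes the decider message itself contains no closing parenthesis, which holds for the output above.

```shell
#!/usr/bin/env bash
# Read a reroute error response on stdin and print only the NO(...) decisions.
# Assumption: the text inside NO(...) contains no ")" of its own.
failed_deciders() {
  grep -o 'NO([^)]*)'
}

# Usage (not run automatically), with the response saved to a file:
#   failed_deciders < reroute-response.json
```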


(Yannick Welsch) #14

There you have your answer though. You have a mixed-version cluster (some nodes have ES v2.3.3 and others ES v2.3.4). Having a primary shard on a newer node does not allow the replica to be allocated to an older node. Upgrade all nodes to the same version.
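To see which node is on which version, the cat nodes API can print the version column. The sketch below assumes `ES_URL` points at one of the nodes; the function names are invented for illustration.

```shell
#!/usr/bin/env bash
# Assumption: ES_URL points at the HTTP port of any node in the cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print one "name version" line per node.
list_node_versions() {
  curl -s "$ES_URL/_cat/nodes?h=name,version"
}

# Read "name version" lines on stdin and print each distinct version once.
distinct_versions() {
  awk 'NF { print $NF }' | sort -u
}

# Usage (not run automatically):
#   list_node_versions                      # which node runs which version
#   list_node_versions | distinct_versions  # more than one line means a mixed cluster
```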


(Peter) #15

I have downgraded the node that was on a different version from the others, and now the cluster is red, with a lot of unassigned shards.

curl http://localhost:19268/_cluster/health?pretty
 {
   "cluster_name" : "config.rets.ci",
   "status" : "red",
   "timed_out" : false,
   "number_of_nodes" : 3,
   "number_of_data_nodes" : 3,
   "active_primary_shards" : 6,
   "active_shards" : 12,
   "relocating_shards" : 0,
   "initializing_shards" : 0,
   "unassigned_shards" : 8,
   "delayed_unassigned_shards" : 0,
   "number_of_pending_tasks" : 0,
   "number_of_in_flight_fetch" : 0,
   "task_max_waiting_in_queue_millis" : 0,
   "active_shards_percent_as_number" : 60.0
 }

curl http://localhost:19268/_cat/shards
site-id      4 p UNASSIGNED                                                 
site-id      4 r UNASSIGNED                                                 
site-id      1 p UNASSIGNED                                                 
site-id      1 r UNASSIGNED                                                 
site-id      3 p STARTED    0 159b 10.0.0.6 config.rets.ci-client-cluster-2 
site-id      3 r STARTED    0 159b 10.0.0.7 config.rets.ci-client-cluster-3 
site-id      2 r STARTED    0 159b 10.0.0.6 config.rets.ci-client-cluster-2 
site-id      2 p STARTED    0 159b 10.0.0.7 config.rets.ci-client-cluster-3 
site-id      0 r STARTED    0 159b 10.0.0.6 config.rets.ci-client-cluster-2 
site-id      0 p STARTED    0 159b 10.0.0.7 config.rets.ci-client-cluster-3 
subscription 4 p UNASSIGNED                                                 
subscription 4 r UNASSIGNED                                                 
subscription 1 p UNASSIGNED                                                 
subscription 1 r UNASSIGNED                                                 
subscription 3 p STARTED    0 159b 10.0.0.6 config.rets.ci-client-cluster-2 
subscription 3 r STARTED    0 159b 10.0.0.7 config.rets.ci-client-cluster-3 
subscription 2 r STARTED    0 159b 10.0.0.6 config.rets.ci-client-cluster-2 
subscription 2 p STARTED    0 159b 10.0.0.7 config.rets.ci-client-cluster-3 
subscription 0 p STARTED    0 159b 10.0.0.6 config.rets.ci-client-cluster-2 
subscription 0 r STARTED    0 159b 10.0.0.7 config.rets.ci-client-cluster-3


(Peter) #16

Managed to fix it finally, by stopping two of the 3 nodes, relocating the unassigned shards onto the node that was still online, and then restarting the other nodes one by one.

Thanks for your help.

