I have a 3-node Elasticsearch cluster. Everything was fine until I created a new index; the newly created index has unassigned shards:
core@client-cluster-1:~$ curl http://localhost:19268/_cat/shards?v
index   shard prirep state      docs store ip       node
site-id 3     p      STARTED       0  159b 10.0.0.6 config.rets.ci-client-cluster-2
site-id 3     r      STARTED       0  159b 10.0.0.7 config.rets.ci-client-cluster-3
site-id 4     p      STARTED       0  159b 10.0.0.5 config.rets.ci-client-cluster-1
site-id 4     r      UNASSIGNED
site-id 2     r      STARTED       0  159b 10.0.0.5 config.rets.ci-client-cluster-1
site-id 2     p      STARTED       0  159b 10.0.0.7 config.rets.ci-client-cluster-3
site-id 1     p      STARTED       0  159b 10.0.0.5 config.rets.ci-client-cluster-1
site-id 1     r      UNASSIGNED
site-id 0     p      STARTED       0  159b 10.0.0.6 config.rets.ci-client-cluster-2
site-id 0     r      STARTED       0  159b 10.0.0.7 config.rets.ci-client-cluster-3
I have tried removing the replicas and then adding them back, but they go straight back to UNASSIGNED.
If I try rerouting the shard, the documentation example uses a from_node, but since the shard is unassigned there is no from_node.
}'
{"error":{"root_cause":[{"type":"json_parse_exception","reason":"Unexpected character ('{' (code 123)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name\n at [Source: org.elasticsearch.transport.netty.ChannelBufferStreamInput@469fd885; line: 3, column: 10]"}],"type":"json_parse_exception","reason":"Unexpected character ('{' (code 123)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name\n at [Source: org.elasticsearch.transport.netty.ChannelBufferStreamInput@469fd885; line: 3, column: 10]"},"status":500}
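The json_parse_exception above points to a malformed request body (something unexpected at line 3, column 10) rather than to a cluster problem. For an unassigned shard, the 2.x reroute API also has an allocate command that takes only a target node, so no from_node is needed. One way to avoid hand-written JSON mistakes is to generate the body. This is a minimal sketch, assuming Python is available on the client; build_allocate_body is a hypothetical helper name, and the target node is simply the one mentioned in this thread.

```python
import json


def build_allocate_body(index, shard, node):
    """Return JSON text for a reroute request holding one allocate command.

    Hypothetical helper: the field names follow the Elasticsearch 2.x
    reroute "allocate" command (index, shard, node).
    """
    body = {
        "commands": [
            {"allocate": {"index": index, "shard": shard, "node": node}}
        ]
    }
    # json.dumps always emits quoted field names and balanced braces,
    # which rules out the kind of parse error shown above.
    return json.dumps(body)


# Values taken from the thread: index site-id, shard 1, target node cluster-2.
payload = build_allocate_body("site-id", 1, "config.rets.ci-client-cluster-2")
print(payload)
```

The printed string can then be sent as the body of a POST to /_cluster/reroute, for example with curl -d.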
After correcting the JSON in the request, I got a different error:
{"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[config.rets.ci-client-cluster-2][10.0.0.6:19368][cluster:admin/reroute]"}],"type":"illegal_argument_exception","reason":"[allocate] allocation of [site-id][1] on node {config.rets.ci-client-cluster-2}{JWGyMaFESxWE-xsy1baMhQ}{10.0.0.6}{10.0.0.6:19368} is not allowed, reason: [YES(allocation disabling is ignored)][YES(allocation disabling is ignored)][YES(shard is not allocated to same node or host)][YES(shard not primary or relocation disabled)][YES(no allocation awareness enabled)][YES(enough disk for shard on node, free: [386.3gb])][YES(primary is already active)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(node passes include/exclude/require filters)][NO(target node version [2.3.3] is older than source node version [2.3.4])][YES(below shard recovery limit of [2])]"},"status":400}
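The reason string in that response is a list of allocation decisions, one [YES(...)] or [NO(...)] entry per check, and the only entry that matters is the NO. A rough sketch for picking the blocking entries out of such a string, assuming the bracketed format shown above; blocking_reasons is a hypothetical helper, and the sample is a shortened copy of the error text.

```python
import re

# Shortened copy of the "reason" text from the error response above.
reason = (
    "[YES(shard is not allocated to same node or host)]"
    "[YES(enough disk for shard on node, free: [386.3gb])]"
    "[NO(target node version [2.3.3] is older than source node version [2.3.4])]"
    "[YES(below shard recovery limit of [2])]"
)

# One decision looks like [VERDICT(explanation)]; the explanation may itself
# contain square brackets, so match lazily up to the closing ")]".
DECISION = re.compile(r"\[(YES|NO|THROTTLE)\((.*?)\)\]")


def blocking_reasons(text):
    """Return the explanations of every decision that is not YES."""
    return [why for verdict, why in DECISION.findall(text) if verdict != "YES"]


blockers = blocking_reasons(reason)
for why in blockers:
    print(why)
```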
There you have your answer, though: you have a mixed-version cluster (some nodes run ES 2.3.3 and others ES 2.3.4). When a primary shard sits on a newer node, its replica cannot be allocated to an older node. Upgrade all nodes to the same version.
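To confirm which nodes are behind, it helps to list each node's version and group them. Below is a small sketch that groups "name version" lines, such as the text you would get from _cat/nodes?h=name,version. The sample input is illustrative only: the error confirms 2.3.3 for cluster-2 and 2.3.4 for the node holding the primary, while the version shown for cluster-3 is an assumption. nodes_by_version is a hypothetical helper name.

```python
from collections import defaultdict

# Illustrative sample of "name version" lines. The version listed for
# cluster-3 is an assumption, not something stated in the thread.
sample = """\
config.rets.ci-client-cluster-1 2.3.4
config.rets.ci-client-cluster-2 2.3.3
config.rets.ci-client-cluster-3 2.3.3
"""


def nodes_by_version(text):
    """Group node names by the version reported next to them."""
    groups = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, version = line.split()
        groups[version].append(name)
    return dict(groups)


groups = nodes_by_version(sample)
mixed = len(groups) > 1  # more than one distinct version means a mixed cluster
print(groups, "mixed-version cluster:", mixed)
```

If more than one version shows up, the nodes on the older version are the ones to upgrade.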
I finally managed to fix it. I stopped two of the three nodes, relocated the unassigned shards onto the node that was still online, and then restarted the other nodes one by one.