Elasticsearch cluster leaving shards unassigned

We're running an Elasticsearch cluster for logging, indexing logs from
multiple locations using Logstash. We recently added two additional nodes
for extra capacity whilst we await further hardware for the cluster's
expansion. Ultimately we aim to have 2 nodes for "realtime" data running on
SSDs to provide fast access to recent data, ageing the data over to HDDs
for older indices. The new nodes we put in have a lot less disk space than
the existing boxes (700GB vs 5TB), but given this will be similar to the
situation we'd have when we implement SSDs, I didn't foresee it being much
of a problem.

As a first attempt, I threw the nodes into the cluster, trusting that the
new disk-space-based allocation rules would mean they wouldn't instantly
get filled up. Unfortunately that wasn't the case: I awoke to find the
cluster had merrily reallocated shards onto the new nodes, filling them in
excess of 99%. After some jigging of settings I managed to remove all data
from these nodes and return the cluster to its previous state (all shards
assigned, cluster state green).
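
For reference, the disk-based allocation rules I was relying on are along
these lines (a sketch set as transient cluster settings; the watermark
values shown are the defaults as I understand them, not necessarily what
the cluster was actually using):

    # Sketch: disk threshold decider and its watermarks, applied as
    # transient cluster settings (values here are illustrative).
    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
      "transient": {
        "cluster.routing.allocation.disk.threshold_enabled": true,
        "cluster.routing.allocation.disk.watermark.low": "85%",
        "cluster.routing.allocation.disk.watermark.high": "90%"
      }
    }'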

As a next approach I tried to implement index/node tagging similar to my
plans for when we implement SSDs. This left us with the following
configuration:

  • Node 1 - 5TB, tags: realtime, archive
  • Node 2 - 5TB, tags: realtime, archive
  • Node 3 - 5TB, tags: realtime, archive
  • Node 4 - 700GB, tags: realtime
  • Node 5 - 700GB, tags: realtime

(All nodes are running Elasticsearch 1.3.1 and Oracle Java 7u55.)
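
The tags are set as a custom node attribute in each node's
elasticsearch.yml, roughly like this (a sketch of my config rather than a
verbatim copy; I'm assuming comma-separated values are the right way to
give a node both tags):

    # elasticsearch.yml on nodes 1-3 (the 5TB boxes)
    node.tag: realtime,archive

    # elasticsearch.yml on nodes 4-5 (the 700GB boxes)
    node.tag: realtime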

Using Curator I then tagged indices older than 10 days as "archive" and
more recent ones as "realtime". Behind the scenes this sets the index shard
allocation "require" rule, which, as I understand it, requires the node to
have the tag, but not ONLY that tag.
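
As far as I can tell, what Curator does behind the scenes is equivalent to
an index settings update like this (the index name is just an example):

    # Per-index allocation rule that Curator applies (example index name).
    curl -XPUT 'http://localhost:9200/logstash-2014.07.20/_settings' -d '{
      "index.routing.allocation.require.tag": "archive"
    }'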

Unfortunately this doesn't appear to have had the desired effect. Most
worryingly, no indices tagged as archive are allocating their replica
shards, leaving 295 unassigned shards. Additionally, the realtime-tagged
indices are only using nodes 4, 5 and, oddly, 3. Node 3 has no shards
except the very latest index and some kibana-int shards.
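
The same picture shows up from the cat APIs, e.g. listing the shards that
have no node assigned:

    # List all shards and filter to the unassigned ones
    curl -s 'http://localhost:9200/_cat/shards' | grep UNASSIGNED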

If I remove the tags and use exclude._ip to pull shards off the new nodes,
I can (slowly) return the cluster to green, which is the approach I took
when the new nodes had filled up completely, but I'd really like to get
this setup sorted so I can have confidence the SSD configuration will work
when the new kit arrives.
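
For reference, that drain is just a transient cluster setting along these
lines (the IP addresses below are placeholders for the two new nodes):

    # Move shards off the new nodes by excluding their IPs
    # (addresses below are placeholders, not the real ones).
    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
      "transient": {
        "cluster.routing.allocation.exclude._ip": "10.0.0.4,10.0.0.5"
      }
    }'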

I have attempted to set cluster.routing.allocation.allow_rebalance to
always, on the theory that the cluster wasn't rebalancing due to the
unassigned replicas. I've also tried setting
cluster.routing.allocation.enable to all, but again this has had no
discernible impact.
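
For completeness, those were applied as cluster settings updates along
these lines (a sketch, not my exact calls):

    # Rebalance/allocation settings I toggled (sketch of the calls).
    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
      "transient": {
        "cluster.routing.allocation.allow_rebalance": "always",
        "cluster.routing.allocation.enable": "all"
      }
    }'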

Have I done something obviously wrong? Or are there diagnostics of some
sort I could use? I've been visualising the allocation of shards using the
Elasticsearch Head plugin.
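
One thing I've been meaning to try is asking the reroute API to explain a
dry-run allocation of one of the stuck replicas, something like the
following (index, shard number and node name are placeholders for one of
the unassigned shards):

    # Dry-run an allocation and ask for the reasoning; index, shard and
    # node below are placeholders, not real values from my cluster.
    curl -XPOST 'http://localhost:9200/_cluster/reroute?dry_run=true&explain=true' -d '{
      "commands": [
        { "allocate": { "index": "logstash-2014.07.10", "shard": 0,
                        "node": "node-1", "allow_primary": false } }
      ]
    }'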

Any assistance would be appreciated, hopefully it's just a stupid mistake
that I can fix easily!

Thanks in advance


Did you solve the problem? I am stuck with the same issue. Any luck?