I have a cluster with five nodes and I've specified that I'd like to
allocate a maximum of two shards to each node for indexes which are created
with five shards and two copies of each (a primary and one replica, making
ten shards to allocate in total for each index). Occasionally I find that, at
midnight when the new
index is created, a shard stays unassigned. For example, yesterday's index
was allocated as follows (brackets indicate primary shard):
Very odd indeed. May I ask which version of ES you are using, and also
how and where you are setting
"index.routing.allocation.total_shards_per_node: 2"? I want to see if I can
duplicate your behavior.
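To make the constraint concrete, here is a minimal sketch of the capacity arithmetic behind that setting. It assumes the numbers from the question (five nodes, a limit of two shards per node, ten shard copies per index made up of five primaries with one replica each). The function name `allocation_slack` is made up for illustration and is not part of Elasticsearch.

```python
def allocation_slack(nodes, max_shards_per_node, primaries, replicas_per_primary):
    """Return the number of spare shard slots left once every copy of one
    index has been placed, given a per-node cap for that index."""
    # Total slots the cluster offers this index under the per-node cap.
    capacity = nodes * max_shards_per_node
    # Every primary plus each of its replicas needs its own slot.
    total_copies = primaries * (1 + replicas_per_primary)
    return capacity - total_copies


# Numbers from the question: 5 nodes, cap of 2, 5 primaries, 1 replica each.
slack = allocation_slack(nodes=5, max_shards_per_node=2,
                         primaries=5, replicas_per_primary=1)
print(slack)  # 0
```

A slack of zero means every node has to take exactly two shards of the index, so there is no spare slot if the allocator cannot use one of the nodes at that moment. This is only arithmetic, not a diagnosis of the unassigned shard.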
Neil, I have been trying to reproduce this but I can't seem to (on 0.90.11
and 1.0.0). Is it possible for you to look at your logs on all nodes
and see if there is anything there that might pinpoint/relate to this?
Sure, looking at the top of the logs for the 17th (when the problem
started) I can see...
[2014-02-17 00:00:00,278][INFO ][cluster.metadata ] [Caretaker] [logstash-2014.02.17] creating index, cause [auto(bulk api)], shards [5]/[1], mappings [default]
[2014-02-17 00:00:00,336][WARN ][transport.netty ] [Caretaker] Message not fully read (request) for [5684538] and action [cluster/nodeIndexCreated], resetting
[2014-02-17 00:00:00,336][WARN ][transport.netty ] [Caretaker] Message not fully read (request) for [5358797] and action [cluster/nodeIndexCreated], resetting
[2014-02-17 00:00:00,336][WARN ][transport.netty ] [Caretaker] Message not fully read (request) for [6186213] and action [cluster/nodeIndexCreated], resetting
... before it gets into the usual stuff about updating dynamic mappings for
the types I've got. It only occurs in the logs on one of the nodes and it's
the node which is currently the master (and I haven't purposely changed
that by restarting nodes or anything similar since then). The logs on the
other nodes don't pick up until 09:00, when I restarted the Logstash
processes which had joined the cluster.
That pattern of logs actually occurs every day when the new index is
created (given the message content I'm assuming that's the case).
Hmmm, I'm assuming you are running LS 1.3.3 and using the elasticsearch
output. I'm wondering if you can use the elasticsearch_http output instead
and see if that makes any difference. I am very curious to know if it works
or not. So something like this in the LS config: