Unassigned shard on Logstash index rollover


(Neil Prosser) #1

I have a cluster with five nodes and I've specified that I'd like to
allocate a maximum of two shards to each node for indexes which are created
with five primary shards and one replica of each (making ten shards to
allocate in total for each index). Occasionally I find that, at midnight
when the new index is created, a shard stays unassigned. For example,
yesterday's index was allocated as follows (brackets indicate a primary
shard):

node01 - (2) 3
node02 - 0 (4)
node03 - 2 (3)
node04 - (0) (1)
node05 - 1 4

This morning the new index was allocated as follows:

node01 - (2) 3
node02 - 2 (3)
node03 - (4)
node04 - 0 (1)
node05 - (0) 1
unassigned - 4

I now have to go in and manually move a shard from one of the nodes, after
which the unassigned shard is allocated.
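
One way to read the layout above, offered as a sketch rather than a confirmed diagnosis: with a limit of two shards per node and ten shard copies on five nodes there is no spare capacity, and Elasticsearch does not place a replica on a node that already holds a copy of the same shard. The Python below encodes this morning's layout by hand and checks which nodes could accept the unassigned replica of shard 4. The `layout` dict and the `can_host` and `candidates` helpers are illustrative names, not Elasticsearch APIs, and real allocation involves more rules than the two modelled here.

```python
# Sketch: which nodes could accept the unassigned replica of shard 4?
# Assumptions: the layout is copied by hand from the post, and only two
# rules are modelled: the total_shards_per_node limit and "no two copies
# of the same shard on one node".

LIMIT = 2  # index.routing.allocation.total_shards_per_node

# node -> shard numbers held (primary or replica)
layout = {
    "node01": [2, 3],
    "node02": [2, 3],
    "node03": [4],
    "node04": [0, 1],
    "node05": [0, 1],
}


def can_host(node_shards, shard, limit=LIMIT):
    """Return (ok, reason) for placing one more copy of `shard` on a node."""
    if shard in node_shards:
        return False, "already holds a copy of this shard"
    if len(node_shards) >= limit:
        return False, "at total_shards_per_node limit"
    return True, "ok"


def candidates(layout, shard):
    """Nodes that could take another copy of `shard` under both rules."""
    return [n for n, held in sorted(layout.items()) if can_host(held, shard)[0]]


if __name__ == "__main__":
    for node, held in sorted(layout.items()):
        ok, reason = can_host(held, 4)
        print(f"{node}: {reason}")
    print("candidates:", candidates(layout, 4))
```

Under these two rules no node qualifies: the only node with a free slot, node03, already holds the primary of shard 4. That would be consistent with the shard only being allocated after another shard is moved by hand.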

The settings for the indices are:

settings: {
index.analysis.analyzer.url_path_analyzer.type: custom
index.query.default_field: message
index.number_of_replicas: 1
index.number_of_shards: 5
index.auto_expand_replicas: false
index.routing.allocation.total_shards_per_node: 2
index.store.compress.tv: true
index.analysis.tokenizer.url_path_tokenizer.type: path_hierarchy
index.store.compress.stored: true
index.analysis.tokenizer.url_path_tokenizer.delimiter: /
index.cache.field.type: soft
index.analysis.analyzer.url_path_analyzer.tokenizer: url_path_tokenizer
index.version.created: 901199
index.uuid: XRooj-ZmRe2c58uYDzsMFQ
}

It's not using the standard Logstash settings (Logstash's elasticsearch
output is set to manage_templates => false).

Does anyone have any ideas as to what I've done wrong that is likely to be
causing these issues?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/731c80d3-c6e0-4afe-b45d-92c70d774e2a%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Binh Ly) #2

Very odd indeed. May I ask which version of ES you are using, and also
how/where you are setting this:
"index.routing.allocation.total_shards_per_node: 2"? I want to see if I can
reproduce your behavior.



(Neil Prosser) #3

I'm using 0.90.11 but I have seen this behaviour in older versions.

The total_shards_per_node setting is set via an index template file living
in $ES_HOME/config/templates/logstash.json.

The relevant part of the template is:

{
  "logstash" : {
    "template" : "logstash-*",
    "settings" : {
      "index" : {
        ...
        "routing" : { "allocation" : { "total_shards_per_node" : 2 } }
      }
    },
    "mappings" : {
      ...
    }
  }
}
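
As a rough check of what this setting implies, the number of shard copies an index needs can be compared with the number of slots that total_shards_per_node leaves available across the cluster. This is only a sketch: the helper name `allocation_slack` is made up for illustration, and the node, shard and replica counts are the ones given in the first post.

```python
def allocation_slack(nodes, shards, replicas, per_node_limit):
    """Free slots left once every shard copy is placed: slots minus copies."""
    copies = shards * (1 + replicas)   # primaries plus their replicas
    slots = nodes * per_node_limit     # most copies the cluster may hold
    return slots - copies


if __name__ == "__main__":
    # Values from the thread: 5 nodes, 5 shards, 1 replica, limit of 2.
    print(allocation_slack(nodes=5, shards=5, replicas=1, per_node_limit=2))
    # Hypothetical alternative: the same index with a limit of 3 per node.
    print(allocation_slack(nodes=5, shards=5, replicas=1, per_node_limit=3))
```

A slack of zero means every node has to end up with exactly two shards of the index. If the last free slot is on a node that already holds the other copy of the remaining shard, that shard has nowhere to go until something else is relocated.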

If you need any other information let me know. I still have the cluster in
a yellow state. It also did the same thing this morning.

node01 - 2 (3)
node02 - (4)
node03 - (0) 1
node04 - (2) 3
node05 - 0 (1)
unassigned - 4
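
A manual move of the kind described in the first post could be expressed as a request body for the cluster reroute API (POST to /_cluster/reroute). The sketch below only builds and prints such a body; it does not contact a cluster. The `suggest_move` and `reroute_body` helpers are hypothetical, the index name is a placeholder, and the node names are the labels used in this thread, which would need replacing with the node names or IDs the cluster actually reports.

```python
import json

LIMIT = 2  # index.routing.allocation.total_shards_per_node

# Layout from the post above: node -> shard numbers held.
layout = {
    "node01": [2, 3],
    "node02": [4],
    "node03": [0, 1],
    "node04": [2, 3],
    "node05": [0, 1],
}


def suggest_move(layout, limit=LIMIT):
    """Pick one shard to move from a full node onto a node with a free slot.

    Returns (shard, from_node, to_node), or None if no such move exists.
    """
    for target, held in sorted(layout.items()):
        if len(held) >= limit:
            continue  # target must have room
        for source, source_held in sorted(layout.items()):
            if source == target or len(source_held) < limit:
                continue  # only take shards from full nodes
            for shard in source_held:
                if shard not in held:  # target must not already hold a copy
                    return shard, source, target
    return None


def reroute_body(index, shard, from_node, to_node):
    """Build a reroute request body containing a single move command."""
    move = {"index": index, "shard": shard,
            "from_node": from_node, "to_node": to_node}
    return {"commands": [{"move": move}]}


if __name__ == "__main__":
    shard, source, target = suggest_move(layout)
    # "logstash-YYYY.MM.DD" is a placeholder, not a real index name.
    body = reroute_body("logstash-YYYY.MM.DD", shard, source, target)
    print(json.dumps(body, indent=2))
```

For this layout the sketch picks shard 2 on node01 and moves it to node02, which frees a slot on a node that does not hold a copy of shard 4.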



(Neil Prosser) #4

On the 19th it left the replica of shard 4 unassigned:

node01 - (4)
node02 - 0 (1)
node03 - (0) 1
node04 - 2 (3)
node05 - (2) 3
unassigned - 4

This morning however, it's allocated all the shards:

node01 - (0) 1
node02 - (2) 3
node03 - (1) 2
node04 - 0 (4)
node05 - (3) 4



(Binh Ly) #5

Neil, I have been trying to reproduce this but I can't seem to (on 0.90.11
and 1.0.0). Would it be possible for you to look at the logs on all nodes
and see if there is anything there that might relate to this?



(Neil Prosser) #6

Sure, looking at the top of the logs for the 17th (when the problem
started) I can see...

[2014-02-17 00:00:00,278][INFO ][cluster.metadata ] [Caretaker] [logstash-2014.02.17] creating index, cause [auto(bulk api)], shards [5]/[1], mappings [default]
[2014-02-17 00:00:00,336][WARN ][transport.netty ] [Caretaker] Message not fully read (request) for [5684538] and action [cluster/nodeIndexCreated], resetting
[2014-02-17 00:00:00,336][WARN ][transport.netty ] [Caretaker] Message not fully read (request) for [5358797] and action [cluster/nodeIndexCreated], resetting
[2014-02-17 00:00:00,336][WARN ][transport.netty ] [Caretaker] Message not fully read (request) for [6186213] and action [cluster/nodeIndexCreated], resetting

... before it gets into the usual stuff about updating dynamic mappings for
the types I've got. It only occurs in the logs on one of the nodes and it's
the node which is currently the master (and I haven't purposely changed
that by restarting nodes or anything similar since then). The logs on the
other nodes don't pick up until 09:00 when I was restarting the Logstash
processes which had joined the cluster.

That pattern of logs actually occurs every day when the new index is
created (given the message content I'm assuming that's the case).
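
To check whether the pattern really recurs at every rollover, the warnings in the master's log could be tallied per action. A minimal sketch, assuming each log entry sits on a single line; the regular expressions and the `count_warn_actions` helper are illustrative and were written against the four lines quoted above, not against the full log format.

```python
import re
from collections import Counter

# [timestamp][LEVEL ][logger ] [node] message
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]"
    r"\[(?P<logger>[^\]]+?)\s*\] \[(?P<node>[^\]]+)\] (?P<msg>.*)$"
)
ACTION = re.compile(r"action \[([^\]]+)\]")

SAMPLE = [
    "[2014-02-17 00:00:00,278][INFO ][cluster.metadata ] [Caretaker] "
    "[logstash-2014.02.17] creating index, cause [auto(bulk api)], "
    "shards [5]/[1], mappings [default]",
    "[2014-02-17 00:00:00,336][WARN ][transport.netty ] [Caretaker] "
    "Message not fully read (request) for [5684538] and action "
    "[cluster/nodeIndexCreated], resetting",
    "[2014-02-17 00:00:00,336][WARN ][transport.netty ] [Caretaker] "
    "Message not fully read (request) for [5358797] and action "
    "[cluster/nodeIndexCreated], resetting",
    "[2014-02-17 00:00:00,336][WARN ][transport.netty ] [Caretaker] "
    "Message not fully read (request) for [6186213] and action "
    "[cluster/nodeIndexCreated], resetting",
]


def count_warn_actions(lines):
    """Count WARN entries per transport action named in the message."""
    counts = Counter()
    for line in lines:
        entry = ENTRY.match(line)
        if not entry or entry.group("level") != "WARN":
            continue
        action = ACTION.search(entry.group("msg"))
        if action:
            counts[action.group(1)] += 1
    return counts


if __name__ == "__main__":
    for action, n in count_warn_actions(SAMPLE).items():
        print(action, n)
```

Running the same function over each day's log file would show whether the warnings appear on days when all shards were allocated as well as on days when one was left unassigned.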



(Binh Ly) #7

Hmmm, I'm assuming you are running LS 1.3.3 and using the elasticsearch
output. I'm wondering if you can use the elasticsearch_http output instead
and see if that makes any difference. I am very curious to know if it works
or not. So something like this in the LS config:

output {
  #elasticsearch {
  #  host => "localhost"
  #  port => 9300
  #}
  elasticsearch_http {
    host => "localhost"
  }
}


