Hi guys, I would really appreciate some help understanding what's going
down with shard allocation in this case:
Elasticsearch version: 1.4.4
We had 3 nodes with 1 shard and 1 replica per index (so net 2 copies of
everything). One node went down and the cluster went red. It started to
reallocate shards as expected, and there were originally ~50 unallocated
shards, 15 of them primaries and the rest replicas.
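For anyone following along, one rough way to get the primary/replica split
of unassigned shards is to tally the text output of _cat/shards. The sketch
below is only an illustration: the function name and the sample rows are
made up, and it assumes the default column order (index, shard, prirep,
state, ...), where unassigned shards have no node columns.

```python
from collections import Counter


def count_unassigned(cat_shards_text):
    """Tally UNASSIGNED shards by type from `_cat/shards` text output.

    Assumes the default column order: index shard prirep state ...
    """
    counts = Counter()
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Skip blank lines, the optional header row, and assigned shards.
        if len(fields) < 4 or fields[3] != "UNASSIGNED":
            continue
        kind = "primary" if fields[2] == "p" else "replica"
        counts[kind] += 1
    return counts


# Hypothetical sample rows, not taken from the real cluster.
SAMPLE = """\
index                    shard prirep state      docs store ip       node
webaccesslogs-2015.04.24 0     p      UNASSIGNED
webaccesslogs-2015.04.24 0     r      UNASSIGNED
webaccesslogs-2015.04.25 0     p      STARTED    1200 3.4mb 10.0.0.1 Agent Axis
webaccesslogs-2015.04.25 0     r      UNASSIGNED
"""

if __name__ == "__main__":
    result = count_unassigned(SAMPLE)
    print("unassigned primaries:", result["primary"])
    print("unassigned replicas:", result["replica"])
```

Feeding it the real output of GET _cat/shards would show whether the
stuck shards really are all primaries.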
It's been a few hours now and there are still 15 outstanding shards, all
primaries, that don't seem to be getting re-allocated. I thought this
would be a pretty standard scenario, so I was really hoping I wouldn't need
to manually walk through and re-allocate the primary shards, but I'm not
sure what else to try at this point to get back to green. Any pointers
would be really appreciated. Here are some of the relevant-seeming bits
folks asked about on IRC:
In the ES logs for the unallocated index names there are lines along the
lines of:
[2015-04-29 22:08:22,803][DEBUG][action.admin.indices.stats] [Agent Axis] [webaccesslogs-2015.04.24][0], node[-r2iQnH4R-mcUy4NicCB5g], [P], s[STARTED]: failed to execute [org.elasticsearch.action.admin.indices.stats.IndicesStatsRequest@6a564a91]
org.elasticsearch.transport.SendRequestTransportException: [Jean-Paul Beaubier][inet[/10.155.165.126:9300]][indices:monitor/stats[s]]
"Jean-Paul Beaubier" is the node that went down
I'm trying to understand why it's stuck in this state given there is no
other info in the logs as far as I can tell about why the shards can't be
allocated. Shouldn't the replicas just be promoted in place to new
primaries and then new replicas created on the other node?
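For reference, the manual route mentioned above goes through the cluster
reroute API. On 1.x that means POSTing an "allocate" command to
_cluster/reroute. The sketch below only builds the JSON body and sends
nothing. The helper names are hypothetical, and using "Agent Axis" as the
target node is an assumption for illustration. Note that allow_primary
forces a primary onto the node even if no copy of the data exists there,
which can leave an empty shard (i.e. data loss), so it is a last resort
rather than a recommendation.

```python
import json


def build_allocate_command(index, shard, node, allow_primary=False):
    """Build one ES 1.x `allocate` reroute command (hypothetical helper)."""
    return {
        "allocate": {
            "index": index,
            "shard": shard,
            "node": node,
            # DANGER: True can create an empty primary and lose data.
            "allow_primary": allow_primary,
        }
    }


def build_reroute_body(unassigned, target_node, allow_primary=False):
    """Build a reroute body for a list of (index, shard_number) pairs."""
    return {
        "commands": [
            build_allocate_command(index, shard, target_node, allow_primary)
            for index, shard in unassigned
        ]
    }


if __name__ == "__main__":
    # The index name comes from the log line above; the target node is
    # an assumption.
    body = build_reroute_body(
        [("webaccesslogs-2015.04.24", 0)], "Agent Axis", allow_primary=True
    )
    # This JSON would be the body of: POST /_cluster/reroute
    print(json.dumps(body, indent=2))
```

Before forcing anything, it is probably worth checking whether the
surviving nodes still hold a copy of each stuck shard on disk.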
Probably super evident, but the output above was actually from
_cat/allocation?v, not /recovery. Sorry about that.
On Wednesday, April 29, 2015 at 5:19:08 PM UTC-7, Alex Schokking wrote: