[v1.5.1] Replica shard stuck initializing and can't read stats for primary shard

Nick_Pentreath · September 17, 2015, 7:23am

Hi,

[Using Elasticsearch 1.5.1]

I currently have an issue where the replicas of one of the 5 primary shards in an index are stuck in the INITIALIZING state (for well over 24 hours now). The primary shard is marked as STARTED, but I cannot retrieve stats for that shard.

Output of cat health:

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks
1442473639 07:07:19  BBB           yellow          3         3     63  30    0    2        0             0

Output of cat shards:

index       shard prirep state       docs store ip            node
AAA_1       1     p      STARTED                _____________ es-live-1
AAA_1       1     r      INITIALIZING           _____________ es-live-3
AAA_1       1     r      INITIALIZING           _____________ es-live-2

Output for another index that is fine, where I can see the shard stats:

index       shard prirep state       docs store ip            node
graphflow_1 4     p      STARTED  5071499 1.9gb _____________ es-live-1
graphflow_1 4     r      STARTED  5071499 1.9gb _____________ es-live-2
graphflow_1 0     p      STARTED  4620643 1.6gb _____________ es-live-1
graphflow_1 0     r      STARTED  4620643 1.6gb _____________ es-live-2
...

I also get this:

[2015-09-17 07:17:53,082][DEBUG][action.admin.cluster.stats] [es-live-1] failed to execute on node [-gLPPrH_R4i5RFKYoeXO3w]
org.elasticsearch.index.engine.EngineClosedException: [AAA_1][1] CurrentState[CLOSED]

Originally I was getting a lot of timeouts and some GC errors on the node that held the PRIMARY of the relevant shard. The node was unresponsive and I had to restart it. Since then the cluster has been yellow with this issue.

Search and aggregations seem to be working, but when I try to run a scan-scroll (using elasticsearch-hadoop for bulk analytics jobs), I get:

SearchPhaseExecutionException[Failed to execute phase [init_scan], all shards failed]

Any help appreciated.

warkolm · September 19, 2015, 1:12am

Try dropping the replica and then adding it back.
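
For anyone landing here later, a minimal sketch of what dropping and re-adding the replica can look like through the index settings API: set `number_of_replicas` to 0, then set it back. The base URL passed in is hypothetical, the helper names are made up for illustration, and the restored replica count of 2 is only an assumption based on the two replica rows in the cat shards output above; check the index's real setting before using it.

```python
import json
import urllib.request


def build_replica_requests(index, replicas):
    """Return the two settings updates: drop all replicas, then restore them."""
    path = "/%s/_settings" % index
    return [
        ("PUT", path, {"index": {"number_of_replicas": 0}}),
        ("PUT", path, {"index": {"number_of_replicas": replicas}}),
    ]


def send(base_url, method, path, body):
    """Send one JSON request to the cluster and return the decoded response."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def reset_replicas(base_url, index, replicas):
    """Apply both updates in order and print each response."""
    for method, path, body in build_replica_requests(index, replicas):
        print(send(base_url, method, path, body))
```

Calling `reset_replicas("http://localhost:9200", "AAA_1", 2)` would issue both updates in order. Nothing here waits for the new replicas to finish recovering, so watch cat shards afterwards until they reach STARTED.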

Nick_Pentreath · September 20, 2015, 8:01am

Thanks - that worked. Any idea on the cause for this, and is it something fixed in later versions?

warkolm · September 20, 2015, 8:32am

Could have been a few things. Turning up logging will give you a better idea of the cause if it happens again.
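
One possible way to turn up logging without a restart is a transient cluster settings update. The sketch below only builds the request; the logger name `indices.recovery` is an assumption about which logger is relevant to a stuck recovery and is not something stated in this thread, and the helper name is made up for illustration.

```python
import json


def build_logging_request(logger_name, level):
    """Build a transient cluster settings update that changes one logger's level."""
    body = {"transient": {"logger." + logger_name: level}}
    return ("PUT", "/_cluster/settings", body)


def as_json(request):
    """Render the request body as the JSON text that would be sent."""
    _method, _path, body = request
    return json.dumps(body)
```

The resulting body can be sent with any HTTP client. Transient settings are lost on a full cluster restart, which is usually what you want for temporary debug logging.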
