Unassigned Shards

We currently have roughly 4000 indices with 3 shards each. Each index
will contain approximately 100k docs.

I have been testing with the node_initial_primaries_recoveries
setting. Checking the cluster health afterwards, it only seems to make
a difference in the number of initializing shards:
{
"active_primary_shards": 404,
"active_shards": 404,
"cluster_name": "engagor",
"initializing_shards": 300,
"number_of_data_nodes": 3,
"number_of_nodes": 3,
"relocating_shards": 0,
"status": "red",
"timed_out": false,
"unassigned_shards": 12766
}
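
(For reference, the health output above comes from the cluster health
API, called roughly like this; host and port are assumed to be the
local node:)

curl -s 'http://localhost:9200/_cluster/health?pretty=true'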

Shards still become active very slowly (roughly one every 5 to 7
seconds across the three nodes).

In the logs there seems to be an issue with starting shards: the same
"shard started" message bounces back and forth between the master and
the node.

See the master's logs for one specific index/shard:
[2011-06-04 16:42:20,289][DEBUG][cluster.action.shard ] [Washout]
received shard started for [shard3309-2011-3][0],
node[xIB_TnLmRX27a2m7U4n9kA], [P], s[INITIALIZING], reason [master
[Washout][xR0LOX4eS2WUnsoUFmfssg][inet[/10.10.10.2:9300]] marked shard
as initializing, but shard already started, mark shard as started]
[2011-06-04 16:42:25,044][DEBUG][cluster.action.shard ] [Washout]
received shard started for [shard3309-2011-3][0],
node[xIB_TnLmRX27a2m7U4n9kA], [P], s[INITIALIZING], reason [master
[Washout][xR0LOX4eS2WUnsoUFmfssg][inet[/10.10.10.2:9300]] marked shard
as initializing, but shard already started, mark shard as started]
[2011-06-04 16:42:29,966][DEBUG][cluster.action.shard ] [Washout]
received shard started for [shard3309-2011-3][0],
node[xIB_TnLmRX27a2m7U4n9kA], [P], s[INITIALIZING], reason [master
[Washout][xR0LOX4eS2WUnsoUFmfssg][inet[/10.10.10.2:9300]] marked shard
as initializing, but shard already started, mark shard as started]
[2011-06-04 16:42:34,770][DEBUG][cluster.action.shard ] [Washout]
received shard started for [shard3309-2011-3][0],
node[xIB_TnLmRX27a2m7U4n9kA], [P], s[INITIALIZING], reason [master
[Washout][xR0LOX4eS2WUnsoUFmfssg][inet[/10.10.10.2:9300]] marked shard
as initializing, but shard already started, mark shard as started]

For the same index/shard on one of the nodes:
[2011-06-04 16:42:13,064][DEBUG][cluster.action.shard ] [Vishanti]
sending shard started for [shard3309-2011-3][0],
node[xIB_TnLmRX27a2m7U4n9kA], [P], s[INITIALIZING], reason [master
[Washout][xR0LOX4eS2WUnsoUFmfssg][inet[/10.10.10.2:9300]] marked shard
as initializing, but shard already started, mark shard as started]
[2011-06-04 16:42:17,897][DEBUG][cluster.action.shard ] [Vishanti]
sending shard started for [shard3309-2011-3][0],
node[xIB_TnLmRX27a2m7U4n9kA], [P], s[INITIALIZING], reason [master
[Washout][xR0LOX4eS2WUnsoUFmfssg][inet[/10.10.10.2:9300]] marked shard
as initializing, but shard already started, mark shard as started]
[2011-06-04 16:42:22,637][DEBUG][cluster.action.shard ] [Vishanti]
sending shard started for [shard3309-2011-3][0],
node[xIB_TnLmRX27a2m7U4n9kA], [P], s[INITIALIZING], reason [master
[Washout][xR0LOX4eS2WUnsoUFmfssg][inet[/10.10.10.2:9300]] marked shard
as initializing, but shard already started, mark shard as started]
[2011-06-04 16:42:27,574][DEBUG][cluster.action.shard ] [Vishanti]
sending shard started for [shard3309-2011-3][0],
node[xIB_TnLmRX27a2m7U4n9kA], [P], s[INITIALIZING], reason [master
[Washout][xR0LOX4eS2WUnsoUFmfssg][inet[/10.10.10.2:9300]] marked shard
as initializing, but shard already started, mark shard as started]
[2011-06-04 16:42:32,378][DEBUG][cluster.action.shard ] [Vishanti]
sending shard started for [shard3309-2011-3][0],
node[xIB_TnLmRX27a2m7U4n9kA], [P], s[INITIALIZING], reason [master
[Washout][xR0LOX4eS2WUnsoUFmfssg][inet[/10.10.10.2:9300]] marked shard
as initializing, but shard already started, mark shard as started]

On Jun 4, 4:07 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

Also, how many indices do you have? For such small indices, make sure to just allocate one shard per index (you have a lot of shards).
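
(To illustrate that suggestion: creating a new index with a single
shard would look roughly like this; the index name and replica count
below are placeholders, not values from this thread:)

curl -XPUT 'http://localhost:9200/someindex/' -d '{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}'

Note that the shard count of an existing index cannot be changed, so
the existing indices would have to be recreated and reindexed.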

On Saturday, June 4, 2011 at 5:05 PM, Shay Banon wrote:

By default, it will throttle 4 concurrent primary allocations per node (which is the important one you want to get to as fast as possible). You can set cluster.routing.allocation.node_initial_primaries_recoveries to a higher value and it will cause more shards to be allocated concurrently.

This throttling is done so a machine will not be overloaded; it might make sense in your case to use a higher value.
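
(A sketch of raising that setting as a static node setting, assuming
the usual config/elasticsearch.yml location; the value 8 is only an
example, and each node needs a restart to pick it up:)

echo 'cluster.routing.allocation.node_initial_primaries_recoveries: 8' >> config/elasticsearch.yml
# restart the node afterwards so the new value takes effect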

On Saturday, June 4, 2011 at 4:11 PM, Engagor wrote:

Throttling seems to be the issue I'm having. See gist 1007894 on
GitHub for debug logs from the master.

The logs get spammed very fast with these kinds of entries.

Is there a setting I should change here?

Thanks in advance
Folke