New replica are not getting assigned

Dhara_Desai · July 17, 2017, 6:44pm

I use kopf for visualization, and through it I tried changing the number-of-replicas setting for an index from 17 to 20, which makes each replication group 21 copies (20 primary shards in total, 3 availability zones).
What I observe is that it assigns some of the replicas (mainly a group of 60), but the others stay unassigned. Any pointers on debugging this issue?

Elasticsearch version 1.7

outcoldman · July 17, 2017, 8:06pm

You can try the Explain API, which can tell you why some shards could not be assigned.

Dhara_Desai · July 18, 2017, 9:45pm

I don't think this API is available in 1.7. Is there any other way to figure out the issue?
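One option that does not depend on the Explain API is to list shard states with `_cat/shards` and count the UNASSIGNED rows. The sketch below is only an illustration: it assumes the default column order (index, shard, prirep, state, ...), and the sample text, index name, and node names are made up.

```python
from collections import Counter


def count_unassigned(cat_shards_text):
    """Count UNASSIGNED shard copies per (index, prirep) in `_cat/shards` text."""
    counts = Counter()
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Assumed default columns: index shard prirep state docs store ip node
        if len(fields) >= 4 and fields[3] == "UNASSIGNED":
            counts[(fields[0], fields[2])] += 1
    return counts


# Hypothetical sample output; unassigned rows have no docs/store/ip/node.
sample = """\
myindex 0 p STARTED    1200 1.1mb 10.0.0.1 node-a
myindex 0 r STARTED    1200 1.1mb 10.0.0.2 node-b
myindex 0 r UNASSIGNED
myindex 1 p STARTED    1100 1.0mb 10.0.0.3 node-c
myindex 1 r UNASSIGNED
"""

print(count_unassigned(sample))
```

The text would normally come from a GET on `/_cat/shards`; fetching it is left out so the sketch runs without a cluster.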

Dhara_Desai · July 18, 2017, 9:51pm

To share more information: it mainly fails on "too many shards on nodes for attribute: [aws_availability_zone]", but I can clearly see that there are many hosts in the availability zone without the shards. The hack I use to force allocation is to increase the number of shards to a relatively high value, so the cluster picks a bunch of shards for allocation; I then reduce the replicas back to what is required, and this solves the problem of unassigned shards. The process is very annoying when you have many indices.
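Scripting the workaround might make it less painful across many indices. The sketch below reads the hack as temporarily raising `number_of_replicas` through the update index settings API and then restoring it; that reading is an assumption, and the index name and replica counts are placeholders. It only builds the requests and does not send them.

```python
import json


def replica_update(index, replicas):
    """Build (method, path, body) for an update-index-settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return "PUT", "/%s/_settings" % index, body


def bump_then_restore(index, required, temporary):
    """Return the two requests: raise the replica count, then restore it."""
    if temporary <= required:
        raise ValueError("temporary replica count must exceed the required one")
    return [replica_update(index, temporary), replica_update(index, required)]


# Hypothetical index name and counts.
for method, path, body in bump_then_restore("myindex", required=11, temporary=20):
    print(method, path, body)
```

In practice you would wait for allocation to settle between the two requests, which this sketch does not attempt.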

warkolm · July 18, 2017, 11:01pm

Why do you have so many replicas?

Dhara_Desai · July 19, 2017, 1:48am

Because we need more replicas for a business requirement. Is there an issue with a higher number of replicas in 1.7?

warkolm · July 19, 2017, 1:49am

No, it's just unusual to see people running that many.

Dhara_Desai · July 19, 2017, 6:52am

We have the following settings for forced awareness:
cluster.routing.allocation.awareness.force.availability_zone.values: zone1, zone2, zone3
cluster.routing.allocation.awareness.attributes: availability_zone

This will get applied to assigning and relocating shards, but will it include new replicas added to the replication group while calculating "shardPerAttribute"?
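For experimenting with these values without editing every node's config file, a request body for the cluster update settings API could be built as below. This is a sketch under an assumption: that the awareness settings are dynamically updatable in your version, which is worth checking against the 1.7 documentation. The helper name is hypothetical.

```python
import json


def awareness_settings_body(attribute, zones, transient=True):
    """Build a JSON body for PUT /_cluster/settings with awareness settings."""
    prefix = "cluster.routing.allocation.awareness"
    settings = {
        prefix + ".attributes": attribute,
        "%s.force.%s.values" % (prefix, attribute): ",".join(zones),
    }
    scope = "transient" if transient else "persistent"
    return json.dumps({scope: settings}, sort_keys=True)


print("PUT /_cluster/settings")
print(awareness_settings_body("availability_zone", ["zone1", "zone2", "zone3"]))
```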

Dhara_Desai · July 19, 2017, 7:37am

Adding a little more analysis: with 20 primaries and 11 replicas, the index has 240 shards in total.
Zone 1 - 73 assigned / 7 unassigned
Zone 2 - 65 assigned / 15 unassigned
Zone 3 - 78 assigned / 2 unassigned
I tried manually rerouting one of the shards to a host in each zone, and got:
NO(too many shards on nodes for attribute: [availability_zone])
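As a sanity check on the figures above, the arithmetic can be written out. This only restates the reported numbers; it makes no claim about how Elasticsearch computes its per-zone limit.

```python
PRIMARIES = 20
REPLICAS = 11

# (assigned, unassigned) per zone, as reported in this post.
zones = {"Zone 1": (73, 7), "Zone 2": (65, 15), "Zone 3": (78, 2)}

total = PRIMARIES * (REPLICAS + 1)  # every primary plus its replicas
assigned = sum(a for a, _ in zones.values())
unassigned = sum(u for _, u in zones.values())
per_zone = {name: a + u for name, (a, u) in zones.items()}

print("total shard copies:", total)
print("assigned:", assigned, "unassigned:", unassigned)
print("per-zone share:", per_zone)
```

The per-zone sums each come to a third of the total, so the breakdown assumes an even split of the 240 copies across the three zones.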

I also ran reroute with explain to get the unassigned_info; it shows "reason": "NODE_LEFT", which is expected.
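For anyone reproducing this, a reroute-with-explain request could be assembled roughly as follows. The index, shard number, and node name are placeholders, and the sketch adds the `dry_run` flag so nothing is actually moved; it only builds the request.

```python
import json


def allocate_command(index, shard, node):
    """Build one `allocate` command for the cluster reroute API."""
    return {"allocate": {"index": index, "shard": shard, "node": node}}


def reroute_request(commands, dry_run=True):
    """Return (method, path, body) for a reroute call that asks for explanations."""
    path = "/_cluster/reroute?explain"
    if dry_run:
        path += "&dry_run"
    return "POST", path, json.dumps({"commands": commands})


# Hypothetical index, shard, and node.
method, path, body = reroute_request([allocate_command("myindex", 3, "node-a")])
print(method, path)
print(body)
```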

I am curious: what happens when the number of hosts in these zones is not equal? Will that create any imbalance in assigning shards? Our index setting for "total_shards_per_node" is the default.

Christian_Dahlqvist · July 19, 2017, 7:40am

I am a bit confused. How many indices do these 240 shards belong to? How many primary shards does each index have? What is the number of replicas set to for these indices? How many data nodes do you have per availability zone?

Dhara_Desai · July 19, 2017, 7:45am

This is the analysis of 1 index with 20 primaries and 11 replicas, and around 80 data nodes in each zone. We do have more indices besides this one.

Christian_Dahlqvist · July 19, 2017, 7:56am

If I have understood your configuration correctly, I would expect Elasticsearch to only assign the primary shard and 2 replicas of that shard as you have defined awareness of 3 zones. That is the purpose of shard allocation awareness according to the docs. Do all nodes have the allocation awareness parameters configured?

Dhara_Desai · July 19, 2017, 8:04am

Yes, all nodes have the allocation awareness parameter set; I verified that.
Are you saying that with 3 zones we can only have a 1 primary + 2 replicas setting? If we have more replicas, what is the expected behavior?

Christian_Dahlqvist · July 19, 2017, 8:10am

I would expect those replicas to be unassigned. If you wanted to have 5 replicas (6 copies of each shard), you could divide up a zone into parts, e.g. zone1a, zone1b, zone2a, zone2b, zone3a and zone3b. If you leave out or alter the forced allocation parameter, Elasticsearch will try to allocate one shard per zone and will now be able to place one primary shard and 5 replicas. This is quite well explained in the example given here.
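The expectation described in this reply, one copy of each shard per zone, can be written as a small calculation. It models the reasoning stated here rather than verified Elasticsearch behaviour, and the function name is made up for illustration.

```python
def expected_unassigned(replicas, zones):
    """Copies of one shard left unassigned if each zone holds at most one copy."""
    copies = replicas + 1  # the primary plus its replicas
    return max(0, copies - zones)


print(expected_unassigned(11, 3))  # 12 copies, 3 zones -> 9
print(expected_unassigned(5, 6))   # 6 copies, 6 zones -> 0
```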

Dhara_Desai · July 19, 2017, 8:13am

That makes sense. I'll have to revisit our Elasticsearch awareness settings.

Dhara_Desai · July 19, 2017, 8:35am

I am curious: with the current force settings, if I look at the replication group of shard 3, it should have 9 unassigned shards, but all are assigned evenly across zones. That is strange and not the expected behavior!

Christian_Dahlqvist · July 19, 2017, 8:42am

That is what surprised me too, and why I asked if all nodes have all parameters correctly set. It does sound strange.

Christian_Dahlqvist · July 19, 2017, 8:45am

Am curious where this error comes from, given that you have specified the allocation awareness attribute as just availability_zone. Is there a mismatch in the configuration?

Dhara_Desai · July 19, 2017, 8:46am

As per this conversation, I understand that removing "cluster.routing.allocation.awareness.force.availability_zone.values: zone1, zone2, zone3" might resolve this issue. I'll quickly test it, as I don't see the need for this setting for now, because we need more than 2 replicas for sure.

Dhara_Desai · July 19, 2017, 8:47am

Sorry about that, there is no mismatch. The attribute is aws_availability_zone. And I observed this error when I ran reroute for a particular shard.
