Hi
I'm currently testing 7.2 from a zip file in a development environment on Windows 10.
I apologize if this has been thoroughly explained elsewhere, but I've looked at a few of the old forced awareness topics here and I think I still don't understand how it works or is supposed to work. I understand how allocation awareness works with attributes, and how filtering works with include and exclude, but not the force.*.values setting.
I've looked at AwarenessAllocationDecider, FilterAllocationDecider and AllocationDecider, and I believe my prior assumption about how forced awareness should work is incorrect. I'm hoping someone can clarify my misunderstanding.
My initial tests and assumptions are as follows:
a) I have 7 nodes within cluster1, with node.attr.zone set to a, b, c, d, e, f, g respectively for node-1 to node-7.
b) I have an index, newindex, with 1 primary shard and 1 replica.
c) I set "transient":{ "cluster.routing.allocation.awareness.attributes": "zone"}
d) I set "transient": {"cluster.routing.allocation.awareness.force.zone.values": "a,b"}. My assumption is that the failover shards will be recreated on nodes with zone values of a or b (node-1 and node-2 respectively).
e) I check which nodes the primary (p) and replica (r) shards are assigned to. Provided the replica is on neither node-1 nor node-2, I shut down the node holding the newindex replica.
f) I expect that the newindex replica shard will be recreated on either node-1 or node-2, corresponding to zone=a,b. However, that doesn't seem to be the case: the replica is just recreated on any available node with the zone attribute.
g) Looking at AwarenessAllocationDecider, it only seems to check the number of available nodes. I haven't looked at the routing code yet, but I was expecting that somewhere it checks the value of the attribute on the nodes (in this case only two nodes would qualify instead of seven) for the decider to make that call. I don't see that in AwarenessAllocationDecider specifically, which makes me think that either I have a fundamental misunderstanding of how 'forced awareness' is supposed to work or there is a bug.
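For anyone who wants to reproduce this, here is a minimal Python sketch of steps c) to e). It only builds the request body I would send to PUT _cluster/settings and reads rows shaped like the output of GET _cat/shards?format=json. The helper names (awareness_settings, nodes_by_copy) and the sample rows are made up for illustration and are not taken from my actual cluster output.

```python
import json


def awareness_settings(attribute, forced_values=None):
    """Build a transient body for PUT _cluster/settings (hypothetical helper)."""
    prefix = "cluster.routing.allocation.awareness"
    settings = {f"{prefix}.attributes": attribute}
    if forced_values:
        # Forced values are sent as one comma-separated string, as in step d).
        settings[f"{prefix}.force.{attribute}.values"] = ",".join(forced_values)
    return {"transient": settings}


def nodes_by_copy(cat_shards_rows, index):
    """Group node names by 'p' (primary) and 'r' (replica) for one index.

    Expects rows shaped like GET _cat/shards?format=json output.
    Unassigned copies (node is null) are skipped.
    """
    placement = {"p": [], "r": []}
    for row in cat_shards_rows:
        if row["index"] == index and row.get("node"):
            placement[row["prirep"]].append(row["node"])
    return placement


# Steps c) and d) combined into a single request body.
body = awareness_settings("zone", ["a", "b"])
print(json.dumps(body, indent=2))

# Step e): illustrative rows only, not real cluster output.
sample_rows = [
    {"index": "newindex", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-3"},
    {"index": "newindex", "shard": "0", "prirep": "r", "state": "STARTED", "node": "node-5"},
    {"index": "other", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-1"},
]
print(nodes_by_copy(sample_rows, "newindex"))
```

With placement like the sample rows, the node I would shut down in step e) is the one listed under "r".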
Hence questions:
i) Why are the force.*.values required if the values aren't used? It seems that just providing awareness.attributes would be sufficient.
ii) Is it supposed to account for primaries or replicas or all shards?
iii) How is it modified by other settings, e.g. allow_rebalance?
iv) How is it supposed to be used with allocation filtering (include/exclude/require)? It seems that just assigning include is sufficient to recreate a failed shard.
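To make question iv) concrete, this is roughly the index-level filter I mean, sketched in Python as the body for PUT /newindex/_settings. The filter_settings helper is hypothetical; the index.routing.allocation.include/exclude/require prefixes are the standard shard allocation filtering settings, and zone is my custom node attribute.

```python
import json

FILTER_KINDS = ("include", "exclude", "require")


def filter_settings(kind, attribute, values):
    """Build an index settings body for shard allocation filtering (hypothetical helper)."""
    if kind not in FILTER_KINDS:
        raise ValueError(f"kind must be one of {FILTER_KINDS}, got {kind!r}")
    # Multiple values are passed as one comma-separated string.
    return {f"index.routing.allocation.{kind}.{attribute}": ",".join(values)}


# Restrict newindex to nodes whose zone attribute is a or b.
include_body = filter_settings("include", "zone", ["a", "b"])
print(json.dumps(include_body, indent=2))
```

In my tests, a filter like this appears to be enough on its own to place the recreated replica on node-1 or node-2, which is why I am unsure what the forced values add.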
Any hints would be very valuable, thank you.
Aziz