Hey there,
I have a 7.17 ES cluster with the embedded infrastructure rules & alerts enabled, and I use ILM for specific kinds of indices.
I have enabled the "Cluster health" rule and I noticed that whenever my ILM policy shrinks a specific index (monthly, from 3 shards to 1), the rule is triggered with the message:
Cluster health alert is firing for xxx. Current health is yellow. Allocate missing replica shards.
So I tested the shrink phase manually and saw that if I execute the command:
POST /test/_shrink/test_shrinked
the test_shrinked index inherits the setting "index.routing.allocation.require._name": "node-2" from the source index.
This causes the missing replica: both the primary and the replica shard of test_shrinked are required to be on the same node, which Elasticsearch does not allow, so the replica stays unassigned.
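The reasoning above can be sketched as a toy model in Python. This is not Elasticsearch code; the function name unassigned_copies and the node names are hypothetical, and the only rules modelled are the require._name filter and the fact that two copies of the same shard cannot share a node.

```python
def unassigned_copies(nodes, required_name, replicas):
    """Toy model: count copies of one shard that cannot be placed.

    nodes         -- names of the data nodes in the cluster (hypothetical)
    required_name -- value of index.routing.allocation.require._name, or None
    replicas      -- number of replicas configured for the index
    """
    # Only nodes matching the allocation filter may hold the shard.
    eligible = [n for n in nodes if required_name is None or n == required_name]
    # One primary plus its replicas, at most one copy per node.
    copies = 1 + replicas
    return max(0, copies - len(eligible))


nodes = ["node-1", "node-2", "node-3"]
print(unassigned_copies(nodes, "node-2", 1))  # filter inherited from the source index
print(unassigned_copies(nodes, None, 1))      # filter cleared during the shrink
```

With the inherited filter only one node is eligible, so one of the two copies has nowhere to go; with the filter cleared both copies can be placed.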
On the other hand, I also tried another _shrink command, following the official docs:
POST /test/_shrink/test_shrink
{
  "settings": {
    "index.routing.allocation.require._name": null,
    "index.blocks.write": null
  }
}
In this second scenario I do not get a permanently missing replica, but right after the shrink the replica shard is in the "INITIALIZING" state, so the cluster health is yellow for a while and the alert is triggered anyway.
Is there any way to make the rule less sensitive, or to avoid this behavior? For example, is there a way to trigger the alert only if the cluster has been yellow for 10 minutes?
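To make the question concrete, here is a minimal Python sketch of the behavior I have in mind. It is not an existing option of the rule as far as I know; the class HealthDebouncer, the method observe and the 10-minute grace period are all hypothetical, and the health samples are assumed to come from periodic checks of the cluster status.

```python
from dataclasses import dataclass
from typing import Optional

GRACE_SECONDS = 10 * 60  # hypothetical 10-minute grace period


@dataclass
class HealthDebouncer:
    """Fires only once health has been non-green for a full grace period."""

    grace_seconds: float = GRACE_SECONDS
    unhealthy_since: Optional[float] = None  # time of the first non-green sample

    def observe(self, status: str, now: float) -> bool:
        """Record one health sample; return True if the alert should fire."""
        if status == "green":
            # Recovery resets the timer, so short yellow periods never alert.
            self.unhealthy_since = None
            return False
        if self.unhealthy_since is None:
            self.unhealthy_since = now
        return now - self.unhealthy_since >= self.grace_seconds
```

With something like this, a short yellow period caused by a replica in INITIALIZING state after a shrink would be ignored, while a replica that stays unassigned would still raise the alert after 10 minutes.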