I would like to set up a watcher that triggers an action (an email, for example) when one of the cluster nodes goes down.
From the documentation I know that it's possible to monitor the cluster status via /_cluster/health. But this is not precise enough for me:
- The status yellow does not always mean that a node is down.
- The attribute number_of_nodes is not reliable: I do not want to hardcode any values in the watcher trigger, because the value would need to change whenever new nodes are added.
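
To make the second point concrete, here is a minimal Python sketch of the kind of relative check I have in mind: compare the node count from two consecutive health snapshots instead of comparing against a fixed number. The function name and the sample dictionaries are hypothetical; they only mimic the number_of_nodes and status fields returned by /_cluster/health, and this is not Watcher syntax.

```python
def node_count_dropped(previous_health, current_health):
    """Return True when the current snapshot reports fewer nodes than the previous one.

    Both arguments are dicts shaped like a /_cluster/health response;
    only the "number_of_nodes" field is read.
    """
    return current_health["number_of_nodes"] < previous_health["number_of_nodes"]


# Hypothetical snapshots taken on two consecutive polls.
previous = {"status": "green", "number_of_nodes": 3}
current = {"status": "yellow", "number_of_nodes": 2}

if node_count_dropped(previous, current):
    print("A node seems to have left the cluster")
```

With a check like this, adding nodes would not require touching the condition, because no expected node count is written into it.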
Any ideas on how to achieve such a metric?
Does active_shards_percent_as_number directly reflect the number of active nodes?