Watcher to check if a cluster node is online

The Kibana UI for my Elasticsearch cluster renders as follows

Here I need to apply a watcher on the node status so that I get an email whenever a node goes offline.
Any input is highly appreciated.

Basically, I want to trigger an alert as soon as a cluster node goes down.
The monitoring data produced for ES nodes does not say whether a node is healthy or not.
I need a way to determine whether a node is healthy and set up a watcher accordingly.

Hi there,

One possible solution is to check whether the number of nodes is lower than you expect. If you have a cluster of 3 nodes, then you could set up a watch like this:

Will this work? Or do you need to know when a specific node has become unavailable?
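
As a rough sketch of the node-count idea (not the exact watch from this post), the watch body could be built like this in Python. The index pattern `.monitoring-es-*`, the `type: node_stats` filter, the `source_node.name` field, the 2-minute window, and the recipient address are all assumptions about the monitoring data and should be checked against your own cluster.

```python
import json


def build_node_count_watch(expected_nodes, recipient):
    """Return a watch body that fires when fewer than `expected_nodes`
    distinct nodes have reported monitoring data recently."""
    return {
        "trigger": {"schedule": {"interval": "1m"}},
        "input": {
            "search": {
                "request": {
                    # Assumed monitoring index pattern and field names.
                    "indices": [".monitoring-es-*"],
                    "body": {
                        "size": 0,
                        "query": {
                            "bool": {
                                "filter": [
                                    {"term": {"type": "node_stats"}},
                                    {"range": {"timestamp": {"gte": "now-2m"}}},
                                ]
                            }
                        },
                        # Count distinct node names seen in the window.
                        "aggs": {
                            "nodes": {"cardinality": {"field": "source_node.name"}}
                        },
                    },
                }
            }
        },
        "condition": {
            "compare": {
                "ctx.payload.aggregations.nodes.value": {"lt": expected_nodes}
            }
        },
        "actions": {
            "notify": {
                "email": {
                    "to": recipient,
                    "subject": "Elasticsearch node count dropped",
                    "body": "Only {{ctx.payload.aggregations.nodes.value}} "
                            "of %d nodes reported." % expected_nodes,
                }
            }
        },
    }


# Hypothetical values: a 3-node cluster and a placeholder address.
watch = build_node_count_watch(3, "ops@example.com")
print(json.dumps(watch, indent=2))
```

The printed JSON is meant to be sent as the body of a put-watch request. Note that it only reports how many nodes are reporting, not which ones are missing.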

Thanks,
CJ

Thanks @cjcenizal
This does give us the number of ES nodes that are down, but I don't think it would tell us the name and IP of those nodes in the email.
Any clue how we can achieve that?
Regards

Unfortunately, I don't think there's any way to get more granularity than this in your watch.

If you are willing to not use the watcher UI (that one is based on aggregates, as you can see from the count() function above) and instead use a terms aggregation on hostnames from the monitoring data, you could compare two time windows against each other and find out which hosts were there 5-10 minutes ago versus those that were there 0-5 minutes ago. If those are different, trigger an alert.

This requires you to write some thorough queries manually first, before writing a watch: basically, query the monitoring data from the last ten minutes, filter it into two buckets (the last 5 minutes and 5-10 minutes ago), and then aggregate on the hostnames in each bucket. It's not a quick task if you are not familiar with the query DSL, but you can try it out and we can help over here.
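
The two-window comparison described above can be sketched in Python as follows. This is only an illustration: the `type: node_stats` filter and the `source_node.name` and `timestamp` field names are assumptions about the monitoring documents, and `sample_response` is made-up data in the shape a named `filters` aggregation returns.

```python
def build_window_query():
    """Query the last ten minutes and split it into two named windows,
    each with a terms aggregation on the (assumed) node name field."""
    return {
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"type": "node_stats"}},
                    {"range": {"timestamp": {"gte": "now-10m"}}},
                ]
            }
        },
        "aggs": {
            "windows": {
                "filters": {
                    "filters": {
                        "previous": {
                            "range": {"timestamp": {"gte": "now-10m", "lt": "now-5m"}}
                        },
                        "current": {"range": {"timestamp": {"gte": "now-5m"}}},
                    }
                },
                "aggs": {
                    "hosts": {"terms": {"field": "source_node.name", "size": 100}}
                },
            }
        },
    }


def hostnames(response, window):
    """Collect the set of node names seen in one window of the response."""
    buckets = response["aggregations"]["windows"]["buckets"][window]["hosts"]["buckets"]
    return {bucket["key"] for bucket in buckets}


def nodes_gone(response):
    """Names present 5-10 minutes ago but absent in the last 5 minutes."""
    return sorted(hostnames(response, "previous") - hostnames(response, "current"))


# Made-up response: node-3 stopped reporting in the most recent window.
sample_response = {
    "aggregations": {
        "windows": {
            "buckets": {
                "previous": {"hosts": {"buckets": [
                    {"key": "node-1", "doc_count": 30},
                    {"key": "node-2", "doc_count": 30},
                    {"key": "node-3", "doc_count": 12},
                ]}},
                "current": {"hosts": {"buckets": [
                    {"key": "node-1", "doc_count": 30},
                    {"key": "node-2", "doc_count": 30},
                ]}},
            }
        }
    }
}

missing = nodes_gone(sample_response)
print(missing)
```

Inside an actual watch, the same set difference would have to be expressed in a script condition, with the resulting names passed on to the email action so the alert can list which nodes disappeared.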
