We are using Metricbeat/ELK to monitor our Kubernetes (EKS) cluster. I would like to create an alert when a node is down. How can I do that?
When the VM is down, the Metricbeat DaemonSet on it cannot send any data, so how can I create an alert that tells me the node is down?
Alert when the telemetry stops arriving, i.e. the Metricbeat document count is 0 for some time interval, grouped by host. This tells you that Metricbeat is not shipping telemetry, but it does not necessarily mean the VM is down. It is a good indicator, but not proof.
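As a rough sketch of that idea outside of Kibana alerting: one could ask Elasticsearch for the latest `@timestamp` per host over a lookback window and flag hosts whose last document is older than a threshold. The query body below uses standard `terms` and `max` aggregations; the field `host.name`, the one-hour lookback, the 5-minute threshold, and the function names are assumptions for illustration, not something from your setup. Note that the lookback has to be longer than the threshold, otherwise a host that has gone silent simply disappears from the results instead of showing up as stale.

```python
from typing import Any, Dict, List


def build_last_seen_query(lookback: str = "now-1h") -> Dict[str, Any]:
    """Search body: latest @timestamp per host within the lookback window.

    Assumes Metricbeat documents carry the ECS field `host.name`.
    """
    return {
        "size": 0,
        "query": {"range": {"@timestamp": {"gte": lookback}}},
        "aggs": {
            "hosts": {
                "terms": {"field": "host.name", "size": 1000},
                "aggs": {"last_seen": {"max": {"field": "@timestamp"}}},
            }
        },
    }


def silent_hosts(response: Dict[str, Any], now_ms: int, threshold_s: int = 300) -> List[str]:
    """Return hosts whose newest document is older than threshold_s seconds.

    `response` is the JSON returned by a search using the body above;
    a `max` aggregation on a date field reports epoch milliseconds in `value`.
    """
    cutoff_ms = now_ms - threshold_s * 1000
    buckets = response["aggregations"]["hosts"]["buckets"]
    return sorted(b["key"] for b in buckets if b["last_seen"]["value"] < cutoff_ms)


# Made-up response for illustration: node-a reported 10 s ago, node-b 600 s ago.
sample_response = {
    "aggregations": {
        "hosts": {
            "buckets": [
                {"key": "node-a", "doc_count": 120, "last_seen": {"value": 1_700_000_590_000}},
                {"key": "node-b", "doc_count": 80, "last_seen": {"value": 1_700_000_000_000}},
            ]
        }
    }
}

print(silent_hosts(sample_response, now_ms=1_700_000_600_000))  # ['node-b']
```

The body from `build_last_seen_query()` would be sent to a search on your Metricbeat indices (commonly `metricbeat-*`, but check your own index pattern) with whatever client you already use.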
Another way to do this is to use Heartbeat and the Uptime app to ping the VM externally and see whether it is up and on the network.
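To illustrate what an external check does conceptually, here is a minimal sketch of a TCP reachability probe run from a machine outside the node. This is not Heartbeat itself, only the same idea in a few lines; the function name, the timeout, and the choice of port are assumptions.

```python
import socket


def is_reachable(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable networks.
        return False
```

For example, `is_reachable("10.0.0.12", 10250)` would probe the kubelet port on a node at a hypothetical address; which port is actually open from your probe location depends on your security groups. A failed probe means the node is not reachable from where the check runs, which again is a strong hint but not the same thing as the VM being down.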
These are two related but slightly different approaches.
I asked about the version because the newer versions, which include Kibana alerting, have an easy way to alert when the telemetry stops.