Alerts on Missing heartbeats

alerting

(Joseph Presley) #1

We would like to alert when a heartbeat is missing from our logs, and we would like to create a visualization of the status of our connections. We know a heartbeat is successful when a specific string appears in our logs. There are multiple heartbeats spread across multiple log files. We have not created a predetermined list of connections, nor do we have a map of our log file naming conventions. We expect one connection per log file.

The strings within a log file are:
35=A is a login
35=5 is a logout
35=0 is a heartbeat.
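
As a rough illustration of how those three strings could be told apart when reading log lines, here is a minimal Python sketch. It assumes the `35=` token is bounded by whitespace, a pipe, or an SOH (`\x01`) character; that delimiter set, the function name `classify_line`, and the sample lines are assumptions, not details taken from the actual logs.

```python
import re
from typing import Optional

# Assumption: the "35=<value>" token is delimited by SOH (\x01), "|" or whitespace,
# so that a token such as "135=0" is not mistaken for a heartbeat.
_MSG_TYPE = re.compile(r"(?:^|[\x01|\s])35=([0-9A-Za-z]+)(?=[\x01|\s]|$)")

# Mapping taken from the list above.
EVENT_NAMES = {"A": "login", "5": "logout", "0": "heartbeat"}


def classify_line(line: str) -> Optional[str]:
    """Return "login", "logout" or "heartbeat" for a log line, else None."""
    match = _MSG_TYPE.search(line)
    if match is None:
        return None
    return EVENT_NAMES.get(match.group(1))


if __name__ == "__main__":
    # Hypothetical sample line, only for demonstration.
    print(classify_line("12:00:01 OUT 8=X|35=0|10=1"))
```

A plain substring search for `35=0` would also work, but it could match unrelated tokens that happen to end in `35=0`, which is why the sketch checks the token boundaries.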

What's the general approach we can take for this type of monitoring? I'm comfortable enough with the Query DSL, Visual Builder, and Watcher but have yet to figure out how to visualize or alert when there is an absence of something for a time series.


(Mark Walkom) #2

If you expect a heartbeat every (e.g.) 1 minute, then you can simply check the last 90 seconds (just to give you a bit of room) and generate an alert if nothing is found.
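
The check itself is small. A minimal Python sketch of the idea, assuming the heartbeat timestamps for one connection have already been fetched (the function name `heartbeat_missing` and the 90-second default are illustrative, not part of any Elastic API):

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def heartbeat_missing(
    heartbeat_times: Iterable[datetime],
    now: datetime,
    window_seconds: int = 90,
) -> bool:
    """Return True when no heartbeat falls inside the last `window_seconds`."""
    cutoff = now - timedelta(seconds=window_seconds)
    return not any(cutoff <= t <= now for t in heartbeat_times)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # A heartbeat two minutes ago is outside the 90-second window.
    print(heartbeat_missing([now - timedelta(seconds=120)], now))
```

The window is deliberately longer than the expected interval so that a heartbeat arriving slightly late does not raise an alert.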


(Joseph Presley) #3

The challenge is that I'm not sure whether there's a way to get the current values so I can tell whether one is missing. Or, if I need a whitelist, I'm not sure how to visualize missing data.


(Mark Walkom) #4

That's why there was an implied question there: how often do you expect heartbeats?


(Joseph Presley) #5

For each socket, we expect heartbeats every 30 seconds to 1 minute.


(Mark Walkom) #6

Ok, then you can use something that looks for hosts that have no values in the last N seconds, as that implies a missing heartbeat.

Something like https://github.com/elastic/examples/tree/master/Alerting/Sample%20Watches/monitoring_cluster_health or https://github.com/elastic/examples/tree/master/Alerting/Sample%20Watches/new_process_started
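
One way to do this without a predetermined list of connections is to search a longer lookback window, group the heartbeats by log file, take the latest timestamp per group, and flag every group whose latest heartbeat is older than N seconds. The Python sketch below builds such a search body as a plain dict and filters the response on the client side. The field names `message`, `@timestamp` and `log.file.path`, the one-day lookback, the bucket limit of 1000, and both function names are assumptions; adjust them to your own mapping. How exactly `35=0` matches also depends on the analyzer on the message field.

```python
from typing import Any, Dict, List


def build_last_heartbeat_query(
    lookback: str = "now-1d", max_connections: int = 1000
) -> Dict[str, Any]:
    """Search body: latest heartbeat timestamp per log file (assumed field names)."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"match_phrase": {"message": "35=0"}},
                    {"range": {"@timestamp": {"gte": lookback}}},
                ]
            }
        },
        "aggs": {
            "by_log_file": {
                "terms": {"field": "log.file.path", "size": max_connections},
                "aggs": {"last_heartbeat": {"max": {"field": "@timestamp"}}},
            }
        },
    }


def stale_connections(
    response: Dict[str, Any], now_ms: int, stale_after_seconds: int = 90
) -> List[str]:
    """Return log files whose latest heartbeat is older than the threshold."""
    cutoff_ms = now_ms - stale_after_seconds * 1000
    buckets = response["aggregations"]["by_log_file"]["buckets"]
    # A max aggregation on a date field reports its value in epoch milliseconds.
    return sorted(
        b["key"] for b in buckets if b["last_heartbeat"]["value"] < cutoff_ms
    )
```

Any log file that produced a heartbeat inside the lookback window becomes its own bucket, so the list of connections is discovered from the data rather than maintained by hand. A connection that logged out cleanly (35=5) would also show up as stale, so a real watch would probably want to exclude those.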