I'm currently working on migrating the functionality of an old element management system (SolarWinds) to Elastic/Heartbeat for monitoring devices and alerting when they go down and come back up.
I've had a lot of success with Heartbeat for many different use cases, but in this case I need alerting behavior from Heartbeat that I'm having trouble implementing.
This is the required functionality I've been asked to replicate:
Heartbeat performs ICMP requests to each monitored device (easy enough).
In Watcher, if monitor.status is "down" but was "up" in the last poll, send an alert.
In Watcher, if monitor.status is "up" and the last poll was "down", send an email stating that the device is now back up.
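
To make the rule concrete, here is a minimal Python sketch of the transition check described above. It is not Watcher syntax, and the function name and labels (classify_transition, "went_down", "came_up") are made up for illustration. It only assumes monitor.status takes the values "up" and "down".

```python
from typing import Optional


def classify_transition(previous: Optional[str], current: str) -> Optional[str]:
    """Compare the last two polls of one device.

    Returns "went_down" for up -> down, "came_up" for down -> up,
    and None when the status is unchanged or there is no previous poll.
    """
    if previous == "up" and current == "down":
        return "went_down"
    if previous == "down" and current == "up":
        return "came_up"
    return None


# Hypothetical sequence of polls for a single device, oldest first.
polls = ["up", "up", "down", "down", "up"]

# Pair each poll with the one before it and classify the change.
events = [classify_transition(prev, cur) for prev, cur in zip(polls, polls[1:])]
print(events)  # [None, 'went_down', None, 'came_up']
```

The point is that an alert fires only on a change of state, so a device that stays down does not produce a new alert on every poll.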
I've written an application that essentially does this for other environments, but the person requesting this behavior is determined to use Heartbeat and get this functionality out of it.
Does anyone have experience writing a Watcher alert that can accomplish this? I don't expect anyone to write this alert for me; I'd just like to know whether anyone has done this in the past, and whether it's worth adding somewhat complex logic to a Watcher alert instead of using a simple polling/emailing Python script that uses Elastic as a back end.
I sincerely appreciate any advice!
Edit: Just to throw it out there, I've heard of this behavior being implemented by creating a separate Watcher alert for each host. In this case I'm monitoring thousands of devices, so those solutions won't cut it =[
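
For what it's worth, one way to cover every device with a single query instead of one alert per host might look like the sketch below. It is untested against a real cluster and rests on several assumptions: that the Heartbeat documents carry monitor.id and @timestamp fields alongside monitor.status, that the time window is wide enough to contain at least two polls per device, and that the number of devices fits in one terms aggregation (a composite aggregation may be needed beyond that). The helper names build_query and find_transitions are hypothetical. The query body uses the standard terms and top_hits aggregations, so something similar might also serve as a Watcher search input, though I have not verified that.

```python
def build_query(window: str = "now-5m", max_hosts: int = 10000) -> dict:
    """Build a search body that returns the two most recent polls per monitor."""
    return {
        "size": 0,  # only the aggregation results are needed
        "query": {"range": {"@timestamp": {"gte": window}}},
        "aggs": {
            "by_monitor": {
                "terms": {"field": "monitor.id", "size": max_hosts},
                "aggs": {
                    "last_two": {
                        "top_hits": {
                            "size": 2,
                            "sort": [{"@timestamp": {"order": "desc"}}],
                            "_source": ["monitor.status"],
                        }
                    }
                },
            }
        },
    }


def find_transitions(response: dict) -> dict:
    """Map each monitor id whose status changed to a (previous, current) pair."""
    changes = {}
    for bucket in response["aggregations"]["by_monitor"]["buckets"]:
        hits = bucket["last_two"]["hits"]["hits"]
        if len(hits) < 2:
            continue  # not enough history in the window to compare
        current = hits[0]["_source"]["monitor"]["status"]   # newest poll
        previous = hits[1]["_source"]["monitor"]["status"]  # poll before it
        if current != previous:
            changes[bucket["key"]] = (previous, current)
    return changes
```

A script (or watch) would run the body from build_query against the Heartbeat indices, pass the response to find_transitions, and send a "down" or "back up" email for each entry returned.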