I'm with a team that is migrating our monitoring stack to open source. I've implemented some fantastic performance and service-specific monitoring using the ELK stack.
One key monitor I still need to implement, which isn't exactly straightforward, is simple node up/down monitoring. I've got this working down to a service-specific level by using Nmap to write to a file, which I tail with Filebeat, e.g.:
nmap mapsmysql01 | grep mysql >> /var/log/alerts/mysqlup.log
However, I then have to periodically check both that MySQL is down and that Filebeat is up before throwing an alert, since missing log entries alone don't tell me which of the two has failed.
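To make that two-part check concrete, here is a minimal sketch of the decision I'm effectively making. The function name should_alert and the "up"/"down" state strings are hypothetical, made up for illustration; how each state is actually obtained is left out.

```shell
#!/bin/sh
# Hypothetical helper: decide whether to alert from two observed states.
#   $1 = MySQL state    ("up" or "down")
#   $2 = Filebeat state ("up" or "down")
# Prints one of: alert, ok, unknown.
should_alert() {
    mysql_state=$1
    filebeat_state=$2

    if [ "$filebeat_state" != "up" ]; then
        # The shipper is not running, so missing log lines say nothing
        # about MySQL itself.
        echo "unknown"
    elif [ "$mysql_state" = "down" ]; then
        # Filebeat is shipping and MySQL is confirmed down.
        echo "alert"
    else
        echo "ok"
    fi
}
```

For example, should_alert down up prints "alert", while should_alert down down prints "unknown" because the Filebeat side can't be trusted.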
Is there a more elegant way to monitor service or node up/down status? I imagine I'm missing something here.