We have Filebeat, Metricbeat, Packetbeat, and Auditbeat all enabled across multiple servers.
I am stumped as to how to set up a watcher that alerts me if a beat on a specific server goes down.
I've looked at Central Management, Heartbeat, and the .monitoring-* index, but without any luck.
Central Management doesn't seem to have an alert feature.
Heartbeat only works for ICMP, HTTP, or TCP.
The .monitoring-* index has beat data, but if the beat is down I won't be receiving data from it. One idea I had is to query the Filebeat monitoring data and alert me if the query returns 0 hits.
But I am not sure if that's the best option or if it's practical.
You could set up a watch that checks for the existence of data in .monitoring-beats-* in the past X duration. If it's not there, then it's possible the beat is down. There could be other reasons for the non-existence of data in this index (e.g. the beat is up but there's a network issue somewhere along the path that gets the monitoring data from the beat into .monitoring-beats-*), so you might get some false positives.
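To make that concrete, here is a rough sketch in Python of the kind of search body such a watch could run, plus the "no data" check on the response. The field names `timestamp`, `beats_stats.beat.host` and `beats_stats.beat.type` are what I'd expect to see in .monitoring-beats-* documents, but treat them as assumptions and check them against your own index mapping. The function names and the 5 minute window are made up for illustration.

```python
def build_beat_check_query(host, beat_type, window="5m"):
    """Search body that counts monitoring docs for one beat on one host
    within the last `window`. Field names are assumptions; verify them
    against the mapping of your .monitoring-beats-* indices."""
    return {
        "size": 0,  # only the hit count matters, not the documents
        "query": {
            "bool": {
                "filter": [
                    {"range": {"timestamp": {"gte": f"now-{window}"}}},
                    {"term": {"beats_stats.beat.host": host}},
                    {"term": {"beats_stats.beat.type": beat_type}},
                ]
            }
        },
    }


def looks_down(response):
    """True if the search response contains no monitoring docs."""
    total = response["hits"]["total"]
    # Newer Elasticsearch versions return {"value": n, "relation": ...},
    # older ones return a plain integer.
    count = total["value"] if isinstance(total, dict) else total
    return count == 0
```

In a watch, the same body would go into the search input, and the condition would compare the hit total to 0. As noted above, a zero count only means "no monitoring data arrived", which is not always the same as "the beat is down".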
I had a little think about this. How would I set up the query for that, and how would the alerting work?
Is it a case of checking that there are records for metricbeat, filebeat, auditbeat and packetbeat on server1, and repeating that for every server? Because then the alerting wouldn't be specific. Ideally I would like an alert that says "filebeat for server1 is down".
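One way to avoid a separate check per server and per beat, sketched in Python: run a single query that groups recent monitoring documents by host and then by beat type, and compare the result against the list of host/beat pairs you expect to see. Anything expected but not observed becomes its own alert message. This is only a sketch under assumptions: the field names `timestamp`, `beats_stats.beat.host` and `beats_stats.beat.type` need to be verified against your .monitoring-beats-* mapping, and the function names, the window and the bucket sizes are hypothetical.

```python
from itertools import product


def build_agg_query(window="5m"):
    """Search body grouping recent monitoring docs by host, then beat type.
    Field names are assumptions; verify them against your index mapping."""
    return {
        "size": 0,
        "query": {"range": {"timestamp": {"gte": f"now-{window}"}}},
        "aggs": {
            "hosts": {
                "terms": {"field": "beats_stats.beat.host", "size": 1000},
                "aggs": {
                    "beats": {
                        "terms": {"field": "beats_stats.beat.type", "size": 20}
                    }
                },
            }
        },
    }


def observed_pairs(response):
    """Collect (host, beat) pairs from the nested terms aggregation buckets."""
    pairs = set()
    for host_bucket in response["aggregations"]["hosts"]["buckets"]:
        for beat_bucket in host_bucket["beats"]["buckets"]:
            pairs.add((host_bucket["key"], beat_bucket["key"]))
    return pairs


def missing_beat_alerts(hosts, beats, response):
    """One message per expected (host, beat) pair with no recent data."""
    expected = set(product(hosts, beats))
    missing = sorted(expected - observed_pairs(response))
    return [f"{beat} for {host} may be down" for host, beat in missing]
```

The expected list has to be maintained by hand (or generated from an inventory), because a beat that has stopped reporting simply disappears from the aggregation; the query alone cannot tell you what should have been there.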