My cluster fires a Kibana alert when there has been no Metricbeat data for a server in the last 15 minutes, which is very useful when there is an issue with the beat/server.
However, when a server is decommissioned, the alerts continue to fire unnecessarily.
I have added queries to the Kibana alert to exclude the host.name of the decommissioned server(s), but in a large estate this is not practical, and I think saving the updated rule is taxing on Kibana alerting.
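To illustrate the exclusion approach described above, here is a minimal sketch of how such a filter could be generated from a list of decommissioned hosts rather than typed by hand. The function name `build_exclusion_kql` and the example hostnames are hypothetical; the only thing taken from my setup is the `host.name` field. It just builds a KQL string and does not talk to Kibana.

```python
def build_exclusion_kql(hostnames):
    """Return a KQL clause that excludes the given host.name values."""
    # De-duplicate and sort so the generated filter is stable between runs.
    unique = sorted(set(hostnames))
    if not unique:
        return ""
    # Escape backslashes and double quotes so each name is a valid quoted phrase.
    quoted = ['"' + h.replace("\\", "\\\\").replace('"', '\\"') + '"' for h in unique]
    return "NOT host.name : (" + " OR ".join(quoted) + ")"


# Hypothetical hostnames, for illustration only.
print(build_exclusion_kql(["web-01", "db-02"]))
```

Even when generated, the clause grows with every decommissioned server and the rule still has to be re-saved each time, which is the part that does not scale.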
Is there any way to manage this better, so that I don’t receive alerts for decommissioned servers? I do not want the alerts generated at all (no email, not indexed and not logged).
I should clarify that I only get 1 alert (on status change); however, the alert index continues to be populated with the alert as active. It therefore appears in the active alerts dashboard I have configured, which is monitored.
There is no apparent way to distinguish a genuine issue, where a server should be reporting in, from a server that is decommissioned and expected not to report in.
The current workaround I have is to give the server a tag, e.g. “decommissioned”, before it is shut down, so that when Kibana alerts I can filter on the tag “decommissioned”, but I wonder whether there is a better way.
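As a rough sketch of the tag workaround, the filter can be written as an Elasticsearch bool query that drops anything carrying the tag. The helper name `exclude_tagged` and the example time range are hypothetical, and it assumes the tag ends up in the `tags` field of whatever documents are being queried, which I have not confirmed for the alert documents. Whether this belongs in the rule filter, the connector condition or the dashboard is left open here.

```python
def exclude_tagged(base_query, tag="decommissioned"):
    """Wrap an Elasticsearch query so that documents carrying `tag` are dropped."""
    return {
        "bool": {
            # Keep the original query as a non-scoring filter.
            "filter": [base_query],
            # Assumption: the tag is stored in the `tags` field.
            "must_not": [{"term": {"tags": tag}}],
        }
    }


# Hypothetical base query: documents from the last 15 minutes.
query = exclude_tagged({"range": {"@timestamp": {"gte": "now-15m"}}})
print(query)
```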
Agreed - the alert is technically still active, which makes total sense.
The alert is a Kibana “rule”, checking every minute whether the “is below 1” condition is met for the metricbeat-* indices/data streams in the last hour, grouped by host.name. “Alert me if a group stops reporting in data” is checked - which we want for general server health issues.
The rule checks every 3 minutes and alerts on status change.
The rule sends an email and uses the alert index connector (again on status change). I do not have the “if alert matches a query” or “if alert is generated during time frame” options checked for the connector.
The dashboard is a custom one built with visuals using the index pattern “.alerts*” - I think you’ve reminded me that this is going to be a problem with regard to the use of system indices - perhaps this is what I should focus on first?!