I recieve "alert-logs" from an monitoring-tool (Tibco Hawk)
There are the fields:
event[ALERT_RECIEVED, ALERT_CLEARED]
alert_id[unique ID for every alert]
more things like IP, Timestamp, ...
So i could get
{"event":"ALERT_RECIEVED", "alert_id":"001", ...}
{"event":"ALERT_RECIEVED", "alert_id":"002", ...}
{"event":"ALERT_RECIEVED", "alert_id":"001", ...}
{"event":"ALERT_RECIEVED", "alert_id":"003", ...}
{"event":"ALERT_CLEARED", "alert_id":"001", ...}
Now I want a metric to count all active (not cleared) alerts.
I must select all ALERT_RECIEVED and deselect all of them with the alert_ids of all ALERT_CLEARED
So the correct count for my example would be: 2(002, 003)
The most performant and scalable way to do this is possibly through a separate entity-centric alert index. Please see this post for further details with respect to a seemingly very similar scenario.
I have used this solution for another visualization but here I can't.
In my example you can see more than one alert can have the same ALERT_ID (because the same "Rule" caused the errors).
If I use the ALERT_ID as ID in Elasticsearch, I'm not abel to see all alerts.
And in want to see all alerts, also the allready cleared in a visualization which showas how many alerts I had in the last few hours/days (Line chart).
You can probably still solve this with an entity-centric index, but instead of just storing the latest document received, you may need to update the entity-centric document instead in order to keep track of the number of received and cleared alerts.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.