I have successfully configured Heartbeat with Uptime and everything works well.
But I have a very strange issue with Kibana Alerts. See screenshots.
In Uptime Alerts I can see and use only the "MATCHING MONITORS ARE DOWN" condition, nothing else. I do see the other filters, but this is the only condition I can select. For example, there is no condition like "ALL MONITORS ARE DOWN".
In Uptime I have had no DOWN monitors for more than the last 15 minutes, but the alert is always active.
When I open the alert, its status is shown as Active, and yes, the actions have been applied.
The issue seems to occur when I use more complex KQL. For example, when I use only one condition, i.e.
tags:lxc
it works just fine. But when I apply more complex KQL, i.e.
monitor.id:("monitor-1" or "monitor-2" or "monitor-3" or "monitor-4")
the alert status goes to Active as if the monitored service were down.
But both queries apply to the same monitors. I only added the tag to work around this issue.
So no monitor is down during the selected period and you are still getting the alert message? To me that seems like a bug. We are continuously improving this.
Yes, this seems to be a bug. And one more thing: there is only one condition to select,
MATCHING MONITORS ARE DOWN
No more choices.
Initially I just wanted to report the bug on GitHub, but when I saw how many fields the form requires, I gave up. In addition, I am a Platinum customer, not a tester, developer, or beta software user. In my opinion, such customers should be spared such time-consuming steps, since I simply do not have time for this. I have described everything here as well as I could, and I will be grateful if you pass this bug on to your colleagues.
@alex-rn what other options would you like to have? You can use any query, and you can also add filters by tag, port, type, location, etc.
In 7.9 we have also released availability checks.
The purpose of this alert is to notify you when your monitor is down. You can select any group of monitors by tweaking the query or filters.
Sure, we will take a look at the bug and see if we can reproduce this ourselves.
The functionality is already well described on your GitHub. But please do not forget about STATUS, so that I can send a Resolved notification with actions. This is also not possible now.
The most basic alert we have talked about defining can be expressed in plain English as,
WHEN "ANY" OR "ALL" MONITOR MATCHING [kql]
IS [status] MORE THAN [numtimes] times
WITHIN [timerange]
FROM ANY [location option]
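To make the template above concrete, here is a minimal Python sketch of how such a rule could be evaluated over a list of check results. It is only an illustration of the plain-English rule, not how Kibana implements it. The Check record, the alert_fires function, and the mode parameter are hypothetical names, and monitor_ids stands in for "the monitors matching [kql]" (no KQL is parsed here).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Check:
    """One heartbeat-style check result (hypothetical record shape)."""
    monitor_id: str
    status: str        # e.g. "up" or "down"
    timestamp: datetime
    location: str      # kept for completeness; checks from any location count


def alert_fires(checks, monitor_ids, status, num_times, time_range, now, mode="any"):
    """Return True when ANY or ALL of the given monitors reported `status`
    more than `num_times` times within `time_range` before `now`."""
    window_start = now - time_range
    counts = {monitor_id: 0 for monitor_id in monitor_ids}
    for check in checks:
        in_window = window_start <= check.timestamp <= now
        if check.monitor_id in counts and check.status == status and in_window:
            counts[check.monitor_id] += 1
    breached = [counts[monitor_id] > num_times for monitor_id in monitor_ids]
    if not breached:
        return False  # no monitors selected, nothing to alert on
    return any(breached) if mode == "any" else all(breached)
```

With mode="any" the rule behaves like "MATCHING MONITORS ARE DOWN" as described in this thread, and mode="all" corresponds to the "ALL MONITORS ARE DOWN" variant asked about above.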
Can you please open an official support ticket as well? You mentioned that you're a platinum customer, which means you can go through our support organization instead of the forum.
@alex-rn we have a PR up to resolve this issue. Apparently complex nested kuery expressions are causing problems. The fix won't make it in before the 7.9.2 release. Until then, you may have to rely on a simple alert on everything, avoiding any nested filters.
There is also a dirty workaround: you can append this to the condition in your kuery bar:
and monitor.status : "down" and @timestamp >= "now-15m" and @timestamp <= "now"
Basically, this filter is being overridden, so you have to add it back to make it work. Keep in mind that now-15m comes from the value you selected in the flyout, like "Monitor is down > 5 times within last 15 mins", so if you have a different condition you have to change it. For example, if you select "WITHIN last 5 minutes", then this value will be "now-5m".
Also, in the flyout you may see the wrong value for "This alert will apply to approximately 0 monitors.", since the query now includes the monitor.status down condition and you may not have any down monitor at the moment, so you can ignore that.
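If you maintain several alerts with different time windows, the appended clause can be generated instead of typed by hand. A small Python sketch follows. The function name with_down_filter and its window parameter are made up for illustration; only the clause text itself comes from the workaround in this thread.

```python
def with_down_filter(base_kql: str, window: str = "15m") -> str:
    """Append the status/time clause from the workaround to a base query.

    `window` must match the "WITHIN last ..." value chosen in the alert
    flyout, e.g. "15m" or "5m".
    """
    clause = (
        f'monitor.status : "down" '
        f'and @timestamp >= "now-{window}" and @timestamp <= "now"'
    )
    # The clause is appended as-is. If the base query has a top-level `or`,
    # it would need its own parentheses, since `and` binds tighter in KQL.
    return f"{base_kql} and {clause}"


if __name__ == "__main__":
    query = 'monitor.id:("monitor-1" or "monitor-2" or "monitor-3" or "monitor-4")'
    print(with_down_filter(query, "5m"))
```

The output is meant to be pasted into the kuery bar of the alert flyout. This only automates the string building and does not change how the alert itself is evaluated.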