Uptime Alerts just do not work

Hello,

I have successfully configured Heartbeat with Uptime and everything works well.

But I have some very strange issues with Kibana Alerts. See screenshots.

  1. In Uptime Alerts I can see and use only the "MATCHING MONITORS ARE DOWN" condition, nothing more. I do see the other filters; what I mean is that this is the only condition available. For example, there is no condition like "ALL MONITORS ARE DOWN".

  2. In Uptime I have had no DOWN monitors for more than the last 15 minutes, but the alert is always active.

What am I doing wrong? I just do not understand the situation. I have tried a lot of tricks and settings, but no luck. Does anyone have an idea?

I am using a simple KQL query:

monitor.id:("monitor-1" or "monitor-2" or "monitor-3" or "monitor-4")

I'm running the latest Elastic Cloud 7.8.1

Hi @alex-rn, sorry we missed your post somehow.

When you say the alert is always active, do you mean you keep getting the alert message via email, Slack, etc.?

Active means the alert shows the status Active when I open it, and yes, the actions are applied.
The issue seems to appear when I use a more complex KQL query. For example, when I use a single condition such as

tags:lxc

it works just fine. But when I apply a more complex KQL query such as

monitor.id:("monitor-1" or "monitor-2" or "monitor-3" or "monitor-4")

the alert status goes to Active as if the monitored service were down.

But both queries apply to the same monitored items. I added the tag only to avoid this issue.

So no monitor is down during the selected period and you are still getting the alert message? To me that looks like a bug. We are continuously improving this.

Would you mind opening an issue at https://github.com/elastic/kibana/issues/new?template=Bug_report.md

We will be happy to take a look at it ASAP.

Regards

Yes, this seems to be a bug. And one more thing: there is only one condition to select

MATCHING MONITORS ARE DOWN

No more choices.

Initially I wanted to report the bug on GitHub, but when I saw how many things had to be filled in on the form, I decided against it. In addition, I am a Platinum customer, not a tester, developer, or beta software user. In my opinion, customers like me should be spared such time-consuming tasks, since I simply do not have time for them. I have described everything here as well as I could, and I will be grateful if you pass this bug on to your colleagues.

Thank you!

@alex-rn what other options would you like to have? You can use any query, or you can add filters by tag, port, type, location, etc.

In 7.9 we have also released availability checks.
The purpose of this alert is to notify you when your monitor is down. You can select any group of monitors by tweaking the query or filters.

Sure, we will take a look at the bug and see if we can reproduce this ourselves.

Regards

The functionality is already well described on your GitHub. But please do not forget about STATUS, so that I can send Resolved with actions. This is also not possible now.

The most basic alert we have talked about defining can be expressed in plain English as,

WHEN "ANY" OR "ALL" MONITOR MATCHING [kql]
IS [status] MORE THAN [numtimes] times 
WITHIN [timerange]
FROM ANY [location option]
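
One possible reading of this definition, sketched in Python. This is only an illustration: `Check` and `condition_met` are hypothetical names and not part of Kibana, `monitor_ids` stands in for the monitors already selected by the [kql] part, and the [location option] is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass
class Check:
    """One heartbeat check result (hypothetical shape, for illustration only)."""
    monitor_id: str
    status: str          # "up" or "down"
    timestamp: datetime


def condition_met(
    checks: Iterable[Check],
    monitor_ids: List[str],
    quantifier: str,         # "ANY" or "ALL"
    status: str,
    num_times: int,
    time_range: timedelta,
    now: datetime,
) -> bool:
    """Return True when ANY/ALL selected monitors had `status`
    more than `num_times` times within `time_range` before `now`."""
    since = now - time_range
    counts = {monitor_id: 0 for monitor_id in monitor_ids}
    for check in checks:
        in_window = since <= check.timestamp <= now
        if check.monitor_id in counts and check.status == status and in_window:
            counts[check.monitor_id] += 1
    matched = [count > num_times for count in counts.values()]
    if not matched:
        # No monitors selected: treat the condition as not met.
        return False
    return all(matched) if quantifier == "ALL" else any(matched)
```

With ANY, a single flapping monitor is enough to trigger; with ALL, every selected monitor has to exceed the threshold.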

Thank you!

Can you please open an official support ticket as well? You mentioned that you're a platinum customer, which means you can go through our support organization instead of the forum.

I tried, but I'm unable to log in to the Support Portal with the credentials I use to log in to Elastic.co. The only way I found is this forum.

I have sent a support email. Case #00597755

@alex-rn we have a PR up to resolve this issue. Apparently complex nested kuery expressions are causing problems. The fix won't be available before the 7.9.2 release. Until then you may have to rely on a simple alert on everything, avoiding any nested filters.

There is also a dirty workaround: you can append this to your kuery bar condition
and monitor.status : "down" and @timestamp >= "now-15m" and @timestamp <= "now"

Basically this filter is being overridden, so you have to add it back to make it work. Keep in mind that now-15m comes from the value you selected in the flyout, e.g. "Monitor is down > 5 times within last 15 mins", so if you have a different condition you have to change it. For example, if you select "WITHIN last 5 minutes", this value becomes "now-5m".

For example it will look like this

Also, in the flyout you may see the wrong value for

"This alert will apply to approximately 0 monitors." because the query now includes the condition monitor.status down and you may not have any down monitor at the moment, so you can ignore that.

Sorry for all the trouble.
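
For reference, the workaround above can be sketched as a small Python helper that builds the combined query string. The function name `append_down_filter` and the `window` parameter are made up for illustration; the sketch assumes `window` mirrors the "within last ..." value chosen in the alert flyout.

```python
def append_down_filter(base_kql: str, window: str = "15m") -> str:
    """Append the status and time-range filter from the workaround to a KQL query.

    `window` is assumed to match the time range selected in the flyout,
    e.g. "15m" or "5m".
    """
    suffix = (
        'and monitor.status : "down" '
        f'and @timestamp >= "now-{window}" and @timestamp <= "now"'
    )
    return f"{base_kql} {suffix}"


print(append_down_filter("tags:lxc", "5m"))
# tags:lxc and monitor.status : "down" and @timestamp >= "now-5m" and @timestamp <= "now"
```

The resulting string is what would be pasted into the kuery bar in place of the original condition.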
