Strange Behaviour in alerts

anyla · September 7, 2021, 1:45pm

Hi,
I'm starting to migrate our old Nagios setup to Elastic's Observability solution.
When I try to create a simple Infrastructure alert, something strange happens.
I define a CPU alert with a very low threshold (in order to test the alert). There seems to be data matching the condition, but when testing the alert it says that

I'm running version 7.14.1.
On another ES instance I have (version 7.12.1), it works well with the same data.

Thank you
Regards

stephenb · September 7, 2021, 3:01pm

If you look closely at the graph, each bar is 1 minute. With "FOR THE LAST 1 minute" as you defined it, the latest bar is below the threshold, so the result is 0.

If you bumped that up to, say, the last 5 minutes, you would get the result you expect.
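
As a rough illustration of the window-size point (a sketch only, not Kibana's actual evaluation code): assume the rule averages the metric over the lookback window and compares that average to the threshold. The function name `window_fires`, the sample values, and the threshold are all made up for the example.

```python
from statistics import mean

def window_fires(samples, window_minutes, threshold):
    """Return True if the mean of the last `window_minutes` one-minute
    samples is above `threshold` (assumed rule semantics, for illustration)."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    window = samples[-window_minutes:]
    if not window:
        return False
    return mean(window) > threshold

# Hypothetical one-minute CPU usage values (percent), oldest first.
cpu = [4.0, 6.0, 5.0, 7.0, 0.5]
threshold = 1.0

print(window_fires(cpu, 1, threshold))  # only the last bar (0.5) is checked
print(window_fires(cpu, 5, threshold))  # mean of all five bars (4.5) is checked
```

With these made-up numbers the 1-minute window does not fire, while the 5-minute window does, which is the difference being described.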

anyla · September 7, 2021, 3:14pm

Hi @stephenb, same result if I put 5 minutes.

In this last graph, each bar represents results aggregated in 5-minute time buckets,

and clearly the result of the aggregation over 5 minutes meets the threshold...
(same as when I put 1 minute)

What am I missing here?

Regards

stephenb · September 7, 2021, 3:35pm

Interesting... And yes, you are right: you should have seen results in the first case as well, because the test runs across the whole hour.

Here is 7.14.0 on mine, and it looks good. I am upgrading to 7.14.1 and will check.

Also @anyla, just to help me reproduce: which menu / screen did you create this alert from?

anyla · September 7, 2021, 3:47pm

Thank you @stephenb

Metrics -> Alerts and Rules -> Infrastructure -> Create inventory rule

(I also tried from Stack Management -> Rules and Connectors -> Create Rule -> Inventory, with the same result)

stephenb · September 7, 2021, 4:04pm

Well, good and bad news... my 7.14.1 works as well. See below.

What are you collecting the metrics with? Metricbeat (what version) or Elastic Agent?

What is your "Check every" set to?

Also, have you created an action to see if it actually fires? (Independent of the graph)

anyla · September 7, 2021, 4:15pm

Yes, I did the same experiment with another instance I have on 7.14, updated it to 7.14.1, and got the same result as you... it is working.
I'm collecting with Metricbeat... I checked and the data is in the index, so I do not understand.

"Check every" is set to 1 minute.

In the graph, the data and the threshold are shown, but if I also check "Alert me if there's no data" it says

I'll continue trying in order to find out what is going wrong.

stephenb · September 7, 2021, 4:34pm

Yes, odd. I cannot reproduce it.

That "6 results of no data" is the check for data gaps, and it is saying yes, there are 6 instances of missing data across that hour.

Have you created an actual action to see if it fires?

Have you made sure that you completely reloaded Kibana in the browser after upgrading? Sometimes that is needed to clear the cache, etc.
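
To make the "no data" count above concrete, here is a minimal sketch (again, not Kibana's real preview logic) that assumes the preview walks one-minute buckets across the hour and counts separately the buckets that exceed the threshold and the buckets that have no value at all. The name `preview_counts` and the sample hour are hypothetical.

```python
def preview_counts(buckets, threshold):
    """Count buckets above `threshold` and buckets with no data (None).
    Assumed semantics for illustration only."""
    fired = sum(1 for value in buckets if value is not None and value > threshold)
    no_data = sum(1 for value in buckets if value is None)
    return fired, no_data

# Hypothetical hour of one-minute CPU buckets: 54 with data, 6 gaps.
hour = [2.0] * 54 + [None] * 6

fired, no_data = preview_counts(hour, threshold=1.0)
print(f"fired={fired}, no_data={no_data}")
```

In this made-up hour, the six empty buckets are reported as no-data results regardless of how the other buckets compare to the threshold.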

anyla · September 7, 2021, 4:34pm

I created an Infrastructure rule for Memory (the average of system.memory.used.pct) and got no results.

I created the same rule as a Metric Threshold rule and it works.

