Strange Behaviour in alerts

Hi,
I'm starting to migrate our old Nagios setup to Elastic's Observability solution.
When trying to create a simple Infrastructure alert, something strange happens.
I defined a CPU alert with a very low threshold (in order to test the alert). There seems to be data matching the condition, but when testing the alert it says that

I'm running version 7.14.1.
In another ES instance I have (version 7.12.1), the same rule with the same data works well.

Thank you
Regards

If you look closely at the graph, each bar is 1 minute. With "FOR THE LAST 1 minute" as you defined it, only the last bar counts, and that bar is below the threshold, so the result is 0.
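
To illustrate the point about the window, here is a minimal sketch of a "for the last N minutes" average check. It is not Kibana's actual implementation, and the function names and sample values are made up for the example.

```python
from statistics import mean


def window_average(samples, now, window_seconds):
    """Average the values whose timestamp falls in (now - window_seconds, now]."""
    recent = [value for ts, value in samples if now - window_seconds < ts <= now]
    return mean(recent) if recent else None


def condition_met(samples, now, window_seconds, threshold):
    """True when the average over the window is above the threshold."""
    avg = window_average(samples, now, window_seconds)
    return avg is not None and avg > threshold


# Hypothetical CPU samples as (epoch_seconds, cpu_fraction), one per minute.
samples = [(60, 0.20), (120, 0.30), (180, 0.25), (240, 0.02), (300, 0.01)]

# Only the last 1-minute bar is evaluated, and it is below 5%.
print(condition_met(samples, now=300, window_seconds=60, threshold=0.05))
# All five bars are averaged, and the average is above 5%.
print(condition_met(samples, now=300, window_seconds=300, threshold=0.05))
```

With these made-up numbers, the same threshold fails on a 1-minute window and passes on a 5-minute window, because earlier bars pull the average up.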

If you bumped that up to, say, the last 5 minutes, you would see what you expect.

Hi @stephenb, I get the same result if I put 5 minutes.


In this last graph, each bar represents results aggregated into 5-minute time buckets

and clearly the result of the aggregation over 5 minutes meets the threshold...
(same as when I put 1 minute)

What am I missing here?

Regards

Interesting... And yes, you are right: you should have seen it in the first case as well, because the test runs across the whole hour.

Here is 7.14.0 on mine, and it looks good. I am upgrading to 7.14.1 and will check.

Also, @anyla, just to help me reproduce: which menu / screen did you create this alert from?

Thank you @stephenb

Metrics -> Alerts and Rules -> Infrastructure -> Create inventory rule

(also tried from the Stack Management -> Rules and Connectors -> Create Rule -> Inventory with the same result)

Well, good and bad news... my 7.14.1 works as well. See below.

What are you collecting the metrics with? Metricbeat (what version) or Elastic Agent?

What is your "Check every" set to?

Also, have you created an action and checked whether it actually fires? (Independent of the graph)

Yes, I did the same experiment with another instance I have on 7.14, updated it to 7.14.1, and got the same result as you... it is working.
I'm collecting with Metricbeat... I checked and the data is in the index, so I do not understand :frowning:
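
For anyone who wants to run the same kind of check, below is a rough sketch of a search body that averages a CPU field per host and per minute. The field `system.cpu.total.norm.pct`, the `host.name` grouping, and the helper name are assumptions for illustration; the inventory rule may compute its CPU metric differently, so treat this only as a sanity check that documents exist.

```python
import json


def build_cpu_check_query(minutes=60, field="system.cpu.total.norm.pct"):
    """Build a search body with per-host, per-minute averages of `field`.

    The field name and the host.name grouping are assumptions, not
    necessarily what the inventory rule itself queries.
    """
    return {
        "size": 0,  # only aggregations are needed, not the documents
        "query": {
            "bool": {
                "filter": [
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m", "lte": "now"}}},
                    {"exists": {"field": field}},
                ]
            }
        },
        "aggs": {
            "per_host": {
                "terms": {"field": "host.name", "size": 50},
                "aggs": {
                    "per_minute": {
                        "date_histogram": {
                            "field": "@timestamp",
                            "fixed_interval": "1m",
                            "min_doc_count": 0,
                        },
                        "aggs": {"avg_cpu": {"avg": {"field": field}}},
                    }
                },
            }
        },
    }


# Print the body so it can be pasted into a _search request.
print(json.dumps(build_cpu_check_query(), indent=2))
```

The printed JSON could be sent as the body of a `_search` request against the Metricbeat indices (for example `metricbeat-*`, if that is the index pattern in use).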

Check every is set to 1 minute

The graph shows the data and the threshold, but if I also check "Alert me if there's no data", it says

image

I'll continue trying in order to find out what is going wrong

Yes, odd. I cannot reproduce it.

That "6 results of no data" message comes from checking whether there were data gaps, and it is saying yes, there are 6 instances of missing data across that hour.
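
As a rough illustration of what "6 instances of missing data across that hour" means, here is a small sketch that counts empty 1-minute buckets. It is only an assumption about the general idea, not the rule's real code, and the timestamps are invented.

```python
def count_empty_buckets(timestamps, start, bucket_seconds, bucket_count):
    """Count fixed-size time buckets that contain no document at all."""
    filled = set()
    for ts in timestamps:
        index = (ts - start) // bucket_seconds
        if 0 <= index < bucket_count:
            filled.add(index)
    return bucket_count - len(filled)


# One hour of 1-minute buckets with one document per minute,
# except for six hypothetical minutes where nothing was indexed.
missing_minutes = {5, 6, 20, 33, 47, 58}
timestamps = [m * 60 + 10 for m in range(60) if m not in missing_minutes]

print(count_empty_buckets(timestamps, start=0, bucket_seconds=60, bucket_count=60))
```

If the real data has gaps like this, it may point at collection or ingest delays rather than at the rule itself.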

Have you created an actual action and checked whether it fires?

Have you made sure that you completely reloaded Kibana in the browser after the upgrade? Sometimes that is needed to clear the cache, etc.

I created an infrastructure rule for memory (the average of system.memory.user.pct) and got no results.

I created the same rule as a Metric Threshold rule and it works.
