Machine learning "actual" value doesn't match values displayed in a visualization

In anomaly detection I get an actual value of "75.0", but when I go to a visualization of the raw data I can't find that actual value. I was thinking that maybe the aggregation was blurring the value, so I chose a smaller interval in the visualization to get at the raw data, but it is still far from the actual; the maximum value shown in the visualization is 35.

The visualization is TSVB and uses the percent format; the actual value in the watcher is multiplied by 100:

                      "actual_per": {
                        "script": {
                          "lang": "painless",
                          "source": """(doc["actual"].value) * 100"""
                        }
                      },

I can't find the value in Discover either.

Values over 0.7 are from November, and I got the alert with timestamp "2020-12-04T12:30:00.000Z".

Another weird thing: when I click the link in the watcher mail to go to the Anomaly Explorer, it doesn't match the top records that come in the mail.

image

I cannot tell from your screenshots where you've confirmed that Anomaly Detection tells you that the actual value is 75.0.

Can you please find the record in .ml-anomalies-* that shows the actual value of 75.0 and share it here?
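Something like this, run from Dev Tools, should surface it; add a term filter on your job_id, and treat the 0.7 threshold as just an example:

    GET .ml-anomalies-*/_search
    {
      "size": 10,
      "query": {
        "bool": {
          "filter": [
            { "term": { "result_type": "record" } },
            { "range": { "actual": { "gte": 0.7 } } }
          ]
        }
      },
      "sort": [ { "record_score": { "order": "desc" } } ]
    }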


Hi Rich, that's one of the problems: the actual value is not there when I search for it in Discover, but when I execute the watch I get the value 0.75, which the script later transforms to 75 by multiplying it by 100.
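For reference, I execute the watch manually from Dev Tools, more or less like this (the watch id is just a placeholder), and read the value from the execution output:

    POST _watcher/watch/my_ml_watch/_execute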

The watcher is set to look for anomalies over the last 30 days, for testing... that could be a problem, right?

We are on Elastic version 7.8.

The timestamp of 1606115700 is from November 23rd, not Dec 4th as you're showing in Discover.

image

Are you sure you're looking back that far in Discover? Also keep in mind that the timestamp is in GMT; you may need to adjust for your timezone, since by default Kibana shows times in your local time zone.


Thanks for your answer Rich, but I don't understand: that is the range the query returns in the bucket result, which should be the range where the bucket anomaly was found, or am I wrong?
image

All I was saying is that you showed me a screenshot of the .ml-anomalies-* index and it showed a record from Nov 23rd. Your screenshot of Discover showed Dec 4th. These are obviously two different dates, and I wanted to make sure you realized that, if you were looking for this specific anomaly record, you had the wrong date.

In your latest screenshot, you're showing a bucket result (result_type:bucket) with a timestamp of 1607085000 which corresponds to Friday, December 4, 2020 12:30:00 PM GMT, but bucket results don't list actual values, only result_type:record results do.
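If you want to see which record results actually fall inside that bucket, a query along these lines should show them (the job_id is a placeholder; records carry the timestamp of the bucket they belong to):

    GET .ml-anomalies-*/_search
    {
      "query": {
        "bool": {
          "filter": [
            { "term": { "job_id": "my_job" } },
            { "term": { "result_type": "record" } },
            { "range": { "timestamp": { "gte": "2020-12-04T12:30:00.000Z", "lte": "2020-12-04T12:30:00.000Z" } } }
          ]
        }
      }
    }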

You now have me confused.


Sorry Rich, English is not my native language.

- Yes, the only actual value equal to 0.75 in the last 30 days is from Nov 23.

- The range on the watcher is 30 days.

image

- Yes, bucket results don't list actual values, only record results do, but the timestamp of the bucket is what defines the time shown in the mail alert.

image

It also sets the time range in which the bucket anomaly occurred.

image

This range is used in the link of the mail alert to show the anomalies in the Anomaly Explorer:

,time:(from:'{{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0.fields.start.0}}',mode:absolute,to:'{{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0.fields.end.0}}')
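As far as I can tell, the start and end fields are script fields on the bucket_results top hits, computed from the bucket timestamp and bucket_span, roughly like this simplified sketch (the padding value is just an example):

    "script_fields": {
      "start": {
        "script": {
          "lang": "painless",
          "source": "Instant.ofEpochMilli(doc['timestamp'].value.toInstant().toEpochMilli() - doc['bucket_span'].value * 1000L * params.padding).toString()",
          "params": { "padding": 10 }
        }
      },
      "end": {
        "script": {
          "lang": "painless",
          "source": "Instant.ofEpochMilli(doc['timestamp'].value.toInstant().toEpochMilli() + doc['bucket_span'].value * 1000L * params.padding).toString()",
          "params": { "padding": 10 }
        }
      }
    }

So the window in the link follows the bucket's timestamp, not the individual record's timestamp.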

If the alerted record anomaly occurred on Nov 23, why is it included in a bucket anomaly that points to December 4?

I think I would need to see your entire Watcher configuration to know exactly what you're attempting to do, but I would think that putting in a range query for the last 30 days is not philosophically compatible with real-time alerting. Why look back so far in time?

Normally, you'd run the Watch every X minutes and look back in time roughly for a similar span of time (i.e. about X minutes as well, to avoid picking up the same alert conditions over and over again).

Setting the frequency of the Watch execution with respect to the look-back window of time is a little trickier with ML job results, because ML job results are written with a timestamp resolution of bucket_span and the value of the timestamp is the leading edge of the bucket. Therefore, there's a general rule that the look-back window of the watch should be about 2*bucket_span of the ML job so that no results are missed.
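As a rough sketch, for a job with a 15m bucket_span you would run the watch every 15 minutes and have its input query look back about 30 minutes (2 * bucket_span); the job_id and anomaly_score threshold here are placeholders:

    {
      "trigger": {
        "schedule": { "interval": "15m" }
      },
      "input": {
        "search": {
          "request": {
            "indices": [ ".ml-anomalies-*" ],
            "body": {
              "query": {
                "bool": {
                  "filter": [
                    { "term": { "job_id": "my_job" } },
                    { "term": { "result_type": "bucket" } },
                    { "range": { "timestamp": { "gte": "now-30m" } } },
                    { "range": { "anomaly_score": { "gte": 75 } } }
                  ]
                }
              }
            }
          }
        }
      }
    }

That keeps the window just wide enough to catch every finalized bucket without re-alerting on the same results over and over.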


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.