Machine learning typical value not set at the moment the watcher fires

Hi,
I'm on Elastic Cloud v7.7 and I'm exploring the machine learning features.
I have a simple job that monitors the count of requests our API endpoint receives over time. The goal is to get alerted when an anomaly occurs, so I've created a watcher that sends an email on those occasions. The watcher triggers every 15 minutes and searches for records with a record_score greater than 50.
The problem I'm facing is that in the emails I receive, the typical value from the record result is always 0. If I look up the anomaly in Kibana, the typical field has a value set (not 0), and even if I trigger my watcher by hand (with the time range adjusted to hit the right time bucket) I get the value.
Here is the screenshot from the original email:


And here is the picture from an email when I triggered the watcher by hand (7 days after the incident):

As you can see, this time I'm getting a value for the typical field.
My question is: is there some kind of time span (threshold) after which the typical value is calculated/pulled for that time bucket?
In my watcher, I'm targeting the time range 'from 17 min ago until 2 min ago' (a 15 min window with a 2 min buffer).
Here are the important parts of my watcher:

{
  "trigger": {
    "schedule": {
      "hourly": {
        "minute": [2, 17, 32,  47]
      }
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          ".ml-anomalies-*"
        ],
        "rest_total_hits_as_int": true,
        "body": {
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "result_type": "record"
                  }
                },
                {
                  "term": {
                    "job_id": "{{ctx.metadata.job_id}}"
                  }
                },
                {
                  "range": {
                    "timestamp": {
                      "gte": "now-{{ctx.metadata.window_period}}-{{ctx.metadata.buffer_period}}",
                      "lte": "now-{{ctx.metadata.buffer_period}}"
                    }
                  }
                },
                {
                  "range": {
                    "record_score": {
                      "gte": "{{ctx.metadata.min_record_score}}"
                    }
                  }
                }
              ]
            }
          },
          "sort": [
            {
              "record_score": {
                "order": "desc"
              }
            }
          ]
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "send_email": { ... }
    }
  },
  "metadata": {
    "min_record_score": 50,
    "buffer_period": "2m",
    "window_period": "15m",
    "job_id": "my_tracker_casa"
  },
  "transform": {
    "script": {
      "id": "transform_ml_watcher_payload"
    }
  }
}

Thanks in advance for any clarification!

I cannot see how you are accessing your data structure, so it is impossible to tell.

One thing to keep in mind is that a transform completely replaces your old payload. So if your transform does not include the original payload data, it won't be there.
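
For illustration only, a minimal inline transform that carries the hits (and therefore the typical values) forward might look like the sketch below. The stored script transform_ml_watcher_payload isn't shown in this thread, so this is an assumption about its shape, not its actual contents:

  "transform": {
    "script": {
      "source": "return [ 'hits': ctx.payload.hits.hits, 'job_id': ctx.metadata.job_id ]"
    }
  }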

ML writes the typical value into the index right away and it is not modified or re-written at a later time.

There must be something else going on here.

Do keep in mind, however, that documents in the .ml-anomalies-* indices are written with a timestamp that is the leading edge of the bucket_span. So, if your job has a bucket_span of 30m and the current time is 12:27pm, then the timestamp on the record for the current bucket is 12:00pm. You'll need to keep that in mind as Watcher searches for results.
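
For example, if the job's bucket_span were 30m (an assumption for illustration; your actual bucket_span may differ), the range filter would need to reach back at least one full bucket_span plus the buffer to pick up the record for the most recent bucket, along the lines of:

  "range": {
    "timestamp": {
      "gte": "now-30m-2m",
      "lte": "now-2m"
    }
  }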

The thing that bothers me is how the same watcher (same transformation, everything the same except the time range in the query) yields different results. The watcher triggered by the scheduler seems not to have the typical field value, while the one I triggered by hand (with the time range modified to catch the desired time bucket) has the typical value set.
And this is not an isolated incident; it always happens. For example, take a look at the graph below:


I got an email (from the watcher) for every incident in the picture, and every single one had typical value = 0. When I ran the watcher by hand later, I got a completely different report, with typical values as expected.
All this makes me doubt that the transformations are to blame, but here is how I access the typical value inside the transform script:

def typ = record._source.typical[0];

and later on, when I need the string representation (to place it into a table cell), I do:

new java.text.DecimalFormat('#').format(typ)
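
A more defensive version of that access (just a sketch on my side, with an arbitrary fallback of 0) would check that the typical array exists and is non-empty before formatting it:

  // Fall back to 0 if the record has no typical value (sketch, assumed fallback)
  def typ = (record._source.typical != null && !record._source.typical.isEmpty())
      ? record._source.typical[0]
      : 0;
  def typStr = new java.text.DecimalFormat('#').format(typ);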

I think the prudent thing to do is to submit the contents of your entire Watch with a support ticket.
