Alerting on forecast

Hi, I'm trying to alert when the forecast prediction value is greater than 95. I need to get the hostname (partition_field_value) and the time when the value surpasses the threshold of 95. I can currently get the value and the partition_field_value, but I can't get the time.

GET .ml-anomalies-custom-prediction*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "range": {
          "timestamp": {
            "gte": "now",
            "lte": "now+4d",
            "format": "strict_date_optional_time||epoch_millis"
          }
        }
      }
    }
  },
  "aggs": {
    "hostname": {
      "terms": {
        "field": "partition_field_value"
      },
      "aggs": {
        "metricAgg": {
          "avg": {
            "field": "forecast_prediction"
          }
        }
      }
    }
  }
}

This is the response of the query:

"aggregations" : {
    "hostname" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 684288,
      "buckets" : [
        {
          "key" : "SitRecove",
          "doc_count" : 1152,
          "metricAgg" : {
            "value" : 6.627173770333747
          }
        },
        {
          "key" : "RACCO",
          "doc_count" : 1152,
          "metricAgg" : {
            "value" : 1.1092647238975397
          }
        }

I know that the time field in an ML index is "timestamp", but I don't know how to aggregate on it. How can I get the time?

Did you try a min aggregation on timestamp to get the start?

However in the description you say:

This does not correspond to the search you run. If you want the exact timestamp, you should not compute an average; instead use, for example, a filter aggregation with a range query (> 95). As a sub-aggregation of the filter, I would use min to get the start timestamp.

I think the condition takes care of that

"condition": {
  "script": {
    "source": "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;",
    "params": {
      "threshold": 95
    }
  }
}

The query is based on the one that the GUI creates, so I assume it is the best way.

Going back to my question, I just want to get the time. I have tried nesting the aggregation inside the hostname agg, but you cannot nest another agg under avg. I have also tried putting the time agg next to the hostname agg, but the response is confusing, and the timestamp is separate from the other data (partition_field_value and forecast_prediction).

Why do you want to use average in the first place? In your requirement you say:

Think about it: an average reduces everything to a single value, so there are no individual timestamps left to choose from.

Right now, you fire the alert when the average of the next 4 days is above 95. This does not fit the requirement you set.
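To illustrate the point, here is a minimal Python sketch with made-up numbers (the timestamps and values are hypothetical, not taken from your index). It shows how a single forecast point can exceed the threshold while the average over the window stays well below it, and why the breach time has to come from the individual points:

```python
from statistics import mean

THRESHOLD = 95

# Hypothetical forecast points for one host: (timestamp, forecast_prediction).
forecast = [
    ("2024-01-01T00:00:00Z", 40.0),
    ("2024-01-02T00:00:00Z", 60.0),
    ("2024-01-03T00:00:00Z", 97.5),
    ("2024-01-04T00:00:00Z", 50.0),
]

# What the avg aggregation sees: one number, no timestamps attached.
average = mean(value for _, value in forecast)

# What the requirement asks for: the earliest point above the threshold.
# ISO 8601 strings in the same format sort chronologically, so min() works.
breaches = [ts for ts, value in forecast if value > THRESHOLD]
first_breach = min(breaches) if breaches else None

print(average > THRESHOLD)  # the average-based condition does not fire
print(first_breach)         # the point-based check finds the breach time
```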

Any suggestions on the type of query?

This query should return all hosts that breach the limit.

Note: It reports all hosts, but only the ones where timestamp != null breach the threshold in the next 4 days.

GET .ml-anomalies-custom-prediction*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "range": {
          "timestamp": {
            "gte": "now",
            "lte": "now+4d",
            "format": "strict_date_optional_time||epoch_millis"
          }
        }
      }
    }
  },
  "aggs": {
    "hostname": {
      "terms": {
        "field": "partition_field_value"
      },
      "aggs": {
        "overLimit": {
          "filter": {
            "range": {
              "forecast_prediction": {
                "gte": 95
              }
            }
          },
          "aggs": {
            "timestamp": {
              "min": {
                "field": "timestamp"
              }
            }
          }
        }
      }
    }
  }
}
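
If you post-process the response outside of Watcher, a rough Python sketch of the "timestamp != null" check could look like this. The function name first_breaches and the sample values are hypothetical. It assumes the usual shape of a min aggregation on a date field: "value" is null when the filter matched no documents, and "value_as_string" is present otherwise.

```python
def first_breaches(response):
    """Map each breaching hostname to the time of its first breach.

    Hosts whose overLimit filter matched nothing have a null timestamp
    value and are skipped.
    """
    result = {}
    for bucket in response["aggregations"]["hostname"]["buckets"]:
        ts = bucket["overLimit"]["timestamp"]
        if ts.get("value") is None:
            continue  # this host never reaches the limit in the window
        # Prefer the formatted date if present, else fall back to epoch millis.
        result[bucket["key"]] = ts.get("value_as_string", ts["value"])
    return result


# Hypothetical response fragment, shaped like the output of the query above.
sample = {
    "aggregations": {
        "hostname": {
            "buckets": [
                {
                    "key": "SitRecove",
                    "doc_count": 1152,
                    "overLimit": {
                        "doc_count": 3,
                        "timestamp": {
                            "value": 1704240000000.0,
                            "value_as_string": "2024-01-03T00:00:00.000Z",
                        },
                    },
                },
                {
                    "key": "RACCO",
                    "doc_count": 1152,
                    "overLimit": {"doc_count": 0, "timestamp": {"value": None}},
                },
            ]
        }
    }
}

print(first_breaches(sample))
```

The same check (skip buckets where the timestamp value is null) is what a watch condition or action would need to do on ctx.payload.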