Alert triggered without any buckets in aggregation

Hi,
I'm trying to build an alert that goes off when an instance of a service has skipped at least 10 heartbeats within the last minute. The watch below is adapted from one I found in another post.
I am aggregating with a script because there are two instances per monitor.name (service name), which can only be told apart by including observer.geo.name. Once the alert goes off, I receive a Slack message.

{
"trigger": {
  "schedule": {
    "interval": "1m"
  }
},
"input": {
  "search": {
    "request": {
      "search_type": "query_then_fetch",
      "indices": [
        "heartbeat-*"
      ],
      "rest_total_hits_as_int": true,
      "body": {
        "size": 0,
        "query": {
          "bool": {
            "must": [
              {
                "match": {
                  "monitor.status": {
                    "query": "down"
                  }
                }
              }
            ],
            "filter": [
              {
                "range": {
                  "@timestamp": {
                    "from": "now-1m"
                  }
                }
              }
            ]
          }
        },
        "aggregations": {
          "monitor_and_geo_names": {
            "terms": {
              "script": "doc['monitor.name'].value + ' at ' + doc['observer.geo.name'].value",
              "min_doc_count": 10,
              "order": {
                "_key": "asc"
              }
            }
          }
        }
      }
    }
  }
},
"condition": {
  "compare": {
    "ctx.payload.hits.total": {
      "gt": 0
    }
  }
},
"actions": {
  "notify-slack": {
    "throttle_period_in_millis": 600000,
    "slack": {
      "account": "monitoring",
      "message": {
        "from": "Health Alert",
        "text": "Missing heartbeats for following host(s):",
        "dynamic_attachments": {
          "list_path": "ctx.payload.aggregations.monitor_and_geo_names.buckets",
          "attachment_template": {
            "color": "warning",
            "title": "{{key}}",
            "text": "Missing heartbeats (last minute): {{doc_count}}"
          }
        }
      }
    }
  }
}
}

The problem is that the alert is triggered even though the aggregation did not yield any buckets (e.g. when an instance skipped just a single heartbeat in the last minute). The fact that my Slack message arrives without any attachments indicates that there are no buckets. Why is the alert triggered anyway? Am I misunderstanding how min_doc_count in the aggregation and ctx.payload.hits.total in my compare condition work together?

Ok, so after experimenting with the query a bit more, it seems like the count of total hits is the count of all matching documents regardless of any aggregation, i.e. min_doc_count in the aggregation does not affect it at all.
Then my question would be, how do I access the number of buckets for comparison in the condition section, so that the alert is only triggered if there are buckets?
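
To illustrate the mismatch, here is a rough sketch of what the search payload might look like for a run where one instance skipped a single heartbeat. The values are made up and other response fields are omitted; the point is only that hits.total counts the matching document while min_doc_count has filtered out every bucket:

```json
{
  "hits": {
    "total": 1,
    "hits": []
  },
  "aggregations": {
    "monitor_and_geo_names": {
      "buckets": []
    }
  }
}
```

With a payload like this, the compare condition on ctx.payload.hits.total (gt 0) is met, but dynamic_attachments has nothing to iterate over, which matches the empty Slack message.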

Seems to work with a scripted condition like so:

"condition": {
    "script": {
      "source": "return ctx.payload.aggregations.monitor_and_geo_names.buckets.size() > 0",
      "lang": "painless"
    }
  }
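
A possible alternative without a script, assuming Watcher's array_compare condition behaves as described in its documentation: it evaluates a comparison against the elements of an array in the payload, and with the quantifier "some" it should not be met when the bucket list is empty. This is an untested sketch against the watch above, and the threshold of 10 simply mirrors min_doc_count:

```json
"condition": {
  "array_compare": {
    "ctx.payload.aggregations.monitor_and_geo_names.buckets": {
      "path": "doc_count",
      "gte": {
        "value": 10,
        "quantifier": "some"
      }
    }
  }
}
```

Since min_doc_count already drops smaller buckets, the comparison here is redundant with the aggregation; it mainly serves to make the condition depend on the buckets rather than on the total hit count.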