Hi,
I've got a Watch set up with a search input on Heartbeat data. It sends an alert when any system has not responded to pings (i.e. has monitor.status == down) in the last X minutes (currently 15).
Here is my watch at the moment:
{
  "trigger": {
    "schedule": {
      "interval": "15m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          "heartbeat-*"
        ],
        "types": [],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "must": [
                {
                  "term": {
                    "monitor.status": {
                      "value": "down"
                    }
                  }
                }
              ],
              "filter": [
                {
                  "range": {
                    "@timestamp": {
                      "from": "now-15m"
                    }
                  }
                }
              ]
            }
          },
          "aggregations": {
            "by_monitors": {
              "terms": {
                "field": "monitor.host",
                "size": 100,
                "min_doc_count": 1
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "send_email": {
      "email": {
        "profile": "standard",
        "to": [
          "###@###.##"
        ],
        "subject": "Unresponsive test systems",
        "body": {
          "html": "{{ctx.payload.hits.total}} system(s) not responding to pings:<P>{{#ctx.payload.aggregations.by_monitors.buckets}}{{key}}<BR>{{/ctx.payload.aggregations.by_monitors.buckets}}"
        }
      }
    }
  }
}
I'd like to change this so that it only triggers when a given system currently has monitor.status:down AND had monitor.status:up just before that (e.g. at some point since now-30m).
How would that be done? Can a transform be used to run a second search, using the monitor.host or monitor.id values returned by the above query, to find the ones that had monitor.status:up earlier? Any examples or suggestions would be much appreciated.
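To make the intent concrete, here is a rough Python sketch of the logic I'm after. It is not Watcher code, just an illustration: the function name, the sample hosts, and the two window sizes are made up, and it assumes "down currently" means at least one down event in the last 15 minutes and "up just before" means at least one up event in the 15 minutes preceding that.

```python
from datetime import datetime, timedelta


def newly_down_hosts(events, now,
                     recent=timedelta(minutes=15),
                     lookback=timedelta(minutes=30)):
    """Return hosts that are 'down' in the recent window and were 'up'
    in the window just before it.

    events: iterable of (host, status, timestamp) tuples (hypothetical shape).
    """
    recent_start = now - recent      # start of the "currently down" window
    earlier_start = now - lookback   # start of the "previously up" window

    down_now = set()
    up_before = set()
    for host, status, ts in events:
        if recent_start <= ts <= now and status == "down":
            down_now.add(host)
        elif earlier_start <= ts < recent_start and status == "up":
            up_before.add(host)

    # Only hosts that satisfy both conditions count as a fresh outage.
    return sorted(down_now & up_before)


# Made-up sample data for illustration only.
now = datetime(2024, 1, 1, 12, 0)
events = [
    ("host-a", "up", now - timedelta(minutes=25)),
    ("host-a", "down", now - timedelta(minutes=5)),
    ("host-b", "down", now - timedelta(minutes=25)),  # already down earlier
    ("host-b", "down", now - timedelta(minutes=5)),
    ("host-c", "up", now - timedelta(minutes=5)),     # still up
]
print(newly_down_hosts(events, now))
```

In this sample only host-a should be reported, since host-b was already down in the earlier window and host-c never went down. What I'm looking for is the Watcher equivalent of that set intersection per monitor.host.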