Hello folks, I'm hoping to get some insight from more experienced users on how to approach my problem.
I have a bunch of devices creating documents in my index. Each document is identified as belonging to a particular device with the hostname.keyword field. All devices create a document every 5 minutes.
What I'm trying to do is create something that will watch this index and send an alarm if any given device stops sending data, and another alarm when it starts sending data again after stopping.
I've come up with the following query. It creates a top-level terms aggregation per hostname, with two filter sub-aggregations covering consecutive 10-minute periods. By comparing the doc_counts of these sub-aggregations, I should be able to tell whether a device went down (documents in the earlier period, none in the later one) or came back up (the reverse).
POST my-index/_search?size=0
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "webhook": "heartbeat"
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now-25m",
              "lte": "now-5m"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "hostname": {
      "terms": {
        "field": "hostname.keyword"
      },
      "aggs": {
        "beforeChunk": {
          "filter": {
            "range": {
              "@timestamp": {
                "gte": "now-25m",
                "lte": "now-15m"
              }
            }
          }
        },
        "afterChunk": {
          "filter": {
            "range": {
              "@timestamp": {
                "gte": "now-15m",
                "lte": "now-5m"
              }
            }
          }
        }
      }
    }
  }
}
Here's the relevant part of the response (truncated to two devices):
"aggregations": {
  "hostname": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "hostname-1",
        "doc_count": 4,
        "beforeChunk": {
          "doc_count": 2
        },
        "afterChunk": {
          "doc_count": 2
        }
      },
      {
        "key": "hostname-2",
        "doc_count": 4,
        "beforeChunk": {
          "doc_count": 2
        },
        "afterChunk": {
          "doc_count": 2
        }
      }
    ]
  }
}
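To make the comparison I have in mind concrete, here is a rough Python sketch of the logic, applied to a response shaped like the one above. The function name classify_hosts is just something I made up for illustration, and the hostname-3 and hostname-4 buckets are invented sample data (they are not in my real response). It also assumes that "zero documents in a chunk" is the right threshold, which may be too strict or too loose in practice.

```python
def classify_hosts(response):
    """Label each hostname bucket as 'down', 'up', or 'unchanged'.

    'down' = documents in beforeChunk but none in afterChunk.
    'up'   = no documents in beforeChunk but some in afterChunk.
    """
    states = {}
    for bucket in response["aggregations"]["hostname"]["buckets"]:
        before = bucket["beforeChunk"]["doc_count"]
        after = bucket["afterChunk"]["doc_count"]
        if before > 0 and after == 0:
            states[bucket["key"]] = "down"
        elif before == 0 and after > 0:
            states[bucket["key"]] = "up"
        else:
            states[bucket["key"]] = "unchanged"
    return states


# Sample data: hostname-1 matches the real response above;
# hostname-3 and hostname-4 are hypothetical, added to show both transitions.
sample = {
    "aggregations": {
        "hostname": {
            "buckets": [
                {"key": "hostname-1", "doc_count": 4,
                 "beforeChunk": {"doc_count": 2}, "afterChunk": {"doc_count": 2}},
                {"key": "hostname-3", "doc_count": 2,
                 "beforeChunk": {"doc_count": 2}, "afterChunk": {"doc_count": 0}},
                {"key": "hostname-4", "doc_count": 2,
                 "beforeChunk": {"doc_count": 0}, "afterChunk": {"doc_count": 2}},
            ]
        }
    }
}

for host, state in classify_hosts(sample).items():
    print(host, state)
```

One thing I noticed while writing this: a device with no documents anywhere in the 25-minute window will not get a bucket at all, so this only detects transitions, not devices that have been silent for a long time.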
I suppose I will need to do this in a Watcher watch, but I'm honestly not sure where to begin, as I am still pretty new to Elastic. Any guidance you could provide would be appreciated.