Having clause equivalent


(Gil) #1

Hi everyone,
I am new to Elasticsearch.
I am trying to build something like keep-alive tracking for a service: every X minutes it sends a new record to ES.
I want to query all the servers that did not send this message in the last Y minutes.
(This sounds to me like a group by + having clause equivalent, but maybe it can be achieved some other way.)

I would appreciate some help.
Thanks.


(Christoph) #2

Assuming that your documents have some structure like this:

{
    "serviceId" : "someServiceA",
    "lastPing" : "2015-11-11T18:19:28"
}

and you can make sure that each service has sent at least one keep-alive doc in the past, you could use a Terms Aggregation to group all the docs of each service and then get the latest ping timestamp per service with a Max Aggregation, like so:

GET /test/keepAlive/_search?search_type=count
{
  "aggs": {
    "services": {
      "terms": {
        "field": "serviceId"
      },
      "aggs": {
        "last": {
          "max": {
            "field": "lastPing"
          }
        }
      }
    }
  }
}

The result should contain one bucket per service, looking something like the one below. You can then go through the buckets to find out which services haven't responded in the last Y minutes or so:

            {
               "key": "someServiceA",
               "doc_count": 3236,
               "last": {
                  "value": 1447265968000,
                  "value_as_string": "2015-11-11T18:19:28.000Z"
               }
            }
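
To illustrate the "go through the buckets" step, here is a minimal client-side sketch in Python. It assumes you have already pulled the list of buckets out of the search response; the function name `stale_services`, the second bucket (`someServiceB`) and the chosen "now" are made up for the example and are not part of any Elasticsearch API.

```python
from datetime import datetime, timedelta, timezone


def stale_services(buckets, now, max_age_minutes):
    """Return the keys of buckets whose latest ping is older than the cutoff.

    `buckets` is assumed to be the list under aggregations.services.buckets,
    where each bucket carries the max aggregation result under "last".
    """
    # The max aggregation on a date field reports epoch milliseconds.
    cutoff_ms = (now - timedelta(minutes=max_age_minutes)).timestamp() * 1000
    return [b["key"] for b in buckets if b["last"]["value"] < cutoff_ms]


# Sample data: the first bucket is the one shown above, the second is hypothetical.
buckets = [
    {"key": "someServiceA", "doc_count": 3236,
     "last": {"value": 1447265968000,
              "value_as_string": "2015-11-11T18:19:28.000Z"}},
    {"key": "someServiceB", "doc_count": 12,
     "last": {"value": 1447266500000}},
]

now = datetime(2015, 11, 11, 18, 30, 0, tzinfo=timezone.utc)
print(stale_services(buckets, now, 5))  # -> ['someServiceA']
```

Keep in mind that a terms aggregation only returns the top 10 buckets by default, so with more services than that you would probably need to raise its "size" setting.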

(Gil) #3

Hi,
Thanks for the response.
I have another question.
Can I check whether a server sent me two messages in the past X minutes?
(I guess I could use min and max on the last ping and compare them, but can this be achieved in a different way?)
As I understand it, this is a "having clause", which is not supported in Elasticsearch.
Thanks


(Christoph) #4

Hi Gil,
you could just restrict the search to docs from the last x minutes with a range filter and then do a terms aggregation like the one mentioned above. That would give you the number of docs per service in that time range. You'd still have to go through the result set and filter out the ones that are below a certain threshold, though. This kind of filtering on aggregation results seems like something the new Bucket Selector Aggregation (one of the pipeline aggregations coming in 2.0) might be good for, but I haven't tried that one myself yet.
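
A rough sketch of both variants in Python, building the request bodies as plain dicts. The helper names, the aggregation name `enough_pings` and the sample buckets are made up for illustration. The `bucket_selector` part is untested and assumes Elasticsearch 2.0 or later; the script syntax differs between versions (early releases refer to the variable as `count`, later ones as `params.count`), so the script is passed in rather than hard-coded.

```python
import copy


def recent_counts_body(minutes, service_field="serviceId", time_field="lastPing"):
    """Search body: docs from the last `minutes` minutes, counted per service."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"range": {time_field: {"gte": "now-%dm" % minutes}}},
        "aggs": {"services": {"terms": {"field": service_field}}},
    }


def services_with_at_least(buckets, min_docs):
    """Client-side 'having': keep services with at least `min_docs` docs."""
    return [b["key"] for b in buckets if b["doc_count"] >= min_docs]


def add_bucket_selector(body, script):
    """Server-side variant: attach a bucket_selector to the terms aggregation."""
    out = copy.deepcopy(body)  # leave the original body untouched
    out["aggs"]["services"]["aggs"] = {
        "enough_pings": {
            "bucket_selector": {
                "buckets_path": {"count": "_count"},
                "script": script,  # e.g. "count >= 2" or "params.count >= 2"
            }
        }
    }
    return out


body = recent_counts_body(10)
sample_buckets = [  # hypothetical response buckets
    {"key": "someServiceA", "doc_count": 3},
    {"key": "someServiceB", "doc_count": 1},
]
print(services_with_at_least(sample_buckets, 2))  # -> ['someServiceA']
```

One caveat: a service that sent nothing at all in the time range usually gets no bucket in this result, so to find those you would have to compare the returned keys against a list of services you expect.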

