Detecting gaps in a numeric range using Elasticsearch

We have a cluster that serves data similar to that shown below:

{
    "identifier" : "A",
    "sequenceNumber" : 1
}

{
    "identifier" : "A",
    "sequenceNumber" : 2
}

{
    "identifier" : "A",
    "sequenceNumber" : 4
}

{
    "identifier" : "A",
    "sequenceNumber" : 8
}

We would like to create a service that, given an identifier, returns any gaps in its sequence numbers. For instance, given identifier A, the service would return something similar to the following:

"gaps" : [
    {
        "lowerBound" : 2,
        "upperBound" : 4
    },
    {
        "lowerBound" : 4,
        "upperBound" : 8
    }
]
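
To make the expected output concrete, here is a minimal Python sketch of the gap definition we have in mind. It assumes the sequence numbers for one identifier have already been fetched into a list, which will not scale to large ranges; the function name gaps_from_numbers is purely illustrative.

```python
def gaps_from_numbers(sequence_numbers):
    """Return the gaps between consecutive sequence numbers.

    Each gap is described by the existing numbers on either side of it,
    matching the lowerBound/upperBound shape shown above.
    """
    ordered = sorted(set(sequence_numbers))
    gaps = []
    # Walk over neighbouring pairs; a difference above 1 means numbers are missing.
    for lower, upper in zip(ordered, ordered[1:]):
        if upper - lower > 1:
            gaps.append({"lowerBound": lower, "upperBound": upper})
    return gaps


print(gaps_from_numbers([1, 2, 4, 8]))
```

For the sample data this reports the two gaps listed above: between 2 and 4, and between 4 and 8.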

Our original thought was to get the maximum sequence number for a particular identifier and then use range aggregation queries to see which sections have gaps. For the above dataset the maximum would be 8, and the first round of range aggregation queries would look similar to the following request and response:

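{
    "aggs" : {
        "identifier" : {
            "range" : {
                "field" : "sequenceNumber",
                "keyed" : true,
                "ranges" : [
                    { "to" : 5 },
                    { "from" : 5 }
                ]
            }
        }
    }
}

Because each range bucket includes its "from" value and excludes its "to" value, the first bucket covers sequence numbers 1 through 4 and the second covers 5 and above. The response would look like this:

{
    "aggregations" : {
        "identifier" : {
            "buckets" : {
                "*-5.0" : {
                    "to" : 5.0,
                    "doc_count" : 3
                },
                "5.0-*" : {
                    "from" : 5.0,
                    "doc_count" : 1
                }
            }
        }
    }
}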

Recursively querying each range that returned fewer documents than expected would eventually reveal the gaps. My question: is there an easier way to do this with Elasticsearch?
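
For reference, the recursive idea can be sketched in Python roughly as follows. This is only a sketch under some assumptions: sequence numbers are unique per identifier, the lowest and highest numbers in the searched range both exist, and count_in_range is a hypothetical callable that would wrap one range aggregation or count request. Here it is backed by an in-memory list so the example runs without a cluster.

```python
import bisect


def make_counter(sorted_numbers):
    """Stand-in for a cluster query: count numbers in [low, high] inclusive."""
    def count_in_range(low, high):
        return (bisect.bisect_right(sorted_numbers, high)
                - bisect.bisect_left(sorted_numbers, low))
    return count_in_range


def missing_runs(count_in_range, low, high):
    """Return (start, end) runs of missing numbers in [low, high], inclusive."""
    if low > high:
        return []
    found = count_in_range(low, high)
    if found == high - low + 1:
        return []                 # range is fully populated
    if found == 0:
        return [(low, high)]      # range is entirely missing
    mid = (low + high) // 2
    left = missing_runs(count_in_range, low, mid)
    right = missing_runs(count_in_range, mid + 1, high)
    # Join runs that touch across the split point.
    if left and right and left[-1][1] + 1 == right[0][0]:
        return left[:-1] + [(left[-1][0], right[0][1])] + right[1:]
    return left + right


def find_gaps(count_in_range, low, high):
    """Report each missing run by the existing numbers on either side of it."""
    return [{"lowerBound": start - 1, "upperBound": end + 1}
            for start, end in missing_runs(count_in_range, low, high)]


counter = make_counter([1, 2, 4, 8])
print(find_gaps(counter, 1, 8))
```

Each call halves only the ranges that are neither complete nor empty, so the number of count requests grows with the number of gaps rather than with the size of the range.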
