Detecting gaps in numeric range using ElasticSearch

Leinad0033 · September 11, 2017, 11:17pm

We have a cluster that serves data similar to that shown below:

{
    "identifier" : "A",
    "sequenceNumber" : 1
}

{
    "identifier" : "A",
    "sequenceNumber" : 2
}

{
    "identifier" : "A",
    "sequenceNumber" : 4
}

{
    "identifier" : "A",
    "sequenceNumber" : 8
}

We would like to create a service that, when given an identifier, returns any gaps in that identifier's sequence numbers. For instance, given identifier A, the service would return something similar to the following:

"gaps" : [
    {
        "lowerBound" : 2,
        "upperBound" : 4
    },
    {
        "lowerBound" : 4,
        "upperBound" : 8
    }
]
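To make the desired output concrete, here is a minimal Python sketch that computes the same gap list from sequence numbers already held in memory. It does not touch Elasticsearch at all; the function name `find_gaps` is hypothetical, and it assumes the sequence numbers are integers where any jump larger than 1 counts as a gap.

```python
from typing import Dict, Iterable, List


def find_gaps(sequence_numbers: Iterable[int]) -> List[Dict[str, int]]:
    """Return the gaps between consecutive sequence numbers.

    Each gap is reported as the two existing values that surround the
    missing numbers, matching the lowerBound/upperBound shape above.
    """
    # Sort and de-duplicate so neighbours can be compared pairwise.
    ordered = sorted(set(sequence_numbers))
    gaps = []
    for lower, upper in zip(ordered, ordered[1:]):
        if upper - lower > 1:
            gaps.append({"lowerBound": lower, "upperBound": upper})
    return gaps


print(find_gaps([1, 2, 4, 8]))
# [{'lowerBound': 2, 'upperBound': 4}, {'lowerBound': 4, 'upperBound': 8}]
```

This only works when every sequence number for an identifier can be fetched and sorted, which may not be practical for large identifiers.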

Our original thought was to get the maximum sequenceNumber for a particular identifier and then use range aggregation queries to see which sections have gaps. For the above dataset, the maximum would be 8, and the first round of range aggregation queries would look similar to the following:

{
    "aggs" : {
        "identifier" : {
            "range" : {
                "field" : "sequenceNumber",
                "keyed" : true,
                "ranges" : [
                    { "to" : 5 },
                    { "from" : 5 }
                ]
            }
        }
    }
}

{
    "aggregations" : {
        "identifier" : {
            "buckets" : {
                "*-5" : {
                    "to" : 5,
                    "doc_count" : 3
                },
                "5-*" : {
                    "from" : 5,
                    "doc_count" : 1
                }
            }
        }
    }
}
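Deciding which ranges "have gaps" comes down to comparing each bucket's doc_count with the number of integers the bucket covers. The Python sketch below shows that comparison for a keyed bucket map, assuming a split at 5. It assumes sequence numbers are unique integers per identifier, that the known minimum and maximum are passed in, and that the range aggregation treats `from` as inclusive and `to` as exclusive. The helper names are hypothetical.

```python
from typing import Dict, List


def expected_count(lower: int, upper_exclusive: int) -> int:
    """Number of integers in the half-open interval [lower, upper_exclusive)."""
    return max(0, upper_exclusive - lower)


def buckets_with_gaps(buckets: Dict[str, dict], min_seq: int, max_seq: int) -> List[str]:
    """Return the keys of buckets holding fewer documents than expected."""
    short = []
    for key, bucket in buckets.items():
        # Open-ended buckets fall back to the known min/max sequence number.
        lower = int(bucket.get("from", min_seq))
        upper = int(bucket.get("to", max_seq + 1))
        if bucket["doc_count"] < expected_count(lower, upper):
            short.append(key)
    return short


# Sample bucket map for sequence numbers 1, 2, 4, 8 split at 5.
sample = {
    "*-5": {"to": 5, "doc_count": 3},
    "5-*": {"from": 5, "doc_count": 1},
}
print(buckets_with_gaps(sample, min_seq=1, max_seq=8))
# ['*-5', '5-*']
```

Here both buckets come up short (3 of 4 and 1 of 4 expected documents), so both would need to be split further.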

Recursively querying each range that returned fewer documents than expected would eventually reveal the gaps. My question is: is there an easier way to do this with ElasticSearch?
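For reference, the recursive idea can be sketched in Python as follows. This is only an illustration under stated assumptions: `count(a, b)` is a hypothetical callback returning how many documents have a sequenceNumber between a and b inclusive, sequence numbers are unique integers, and an in-memory list stands in for the cluster. In practice the callback would wrap some count request filtered by identifier and range, which is not shown here.

```python
from bisect import bisect_left, bisect_right
from typing import Callable, List, Sequence, Tuple


def missing_ranges(count: Callable[[int, int], int],
                   lower: int, upper: int) -> List[Tuple[int, int]]:
    """Return inclusive (first_missing, last_missing) runs within [lower, upper]."""
    expected = upper - lower + 1
    found = count(lower, upper)
    if found >= expected:
        return []                  # range is fully populated
    if found == 0:
        return [(lower, upper)]    # whole range is missing
    # Partially populated: split in half and recurse into both sides.
    mid = (lower + upper) // 2
    left = missing_ranges(count, lower, mid)
    right = missing_ranges(count, mid + 1, upper)
    # Join runs that touch across the split point.
    if left and right and left[-1][1] + 1 == right[0][0]:
        return left[:-1] + [(left[-1][0], right[0][1])] + right[1:]
    return left + right


def make_counter(values: Sequence[int]) -> Callable[[int, int], int]:
    """In-memory stand-in for a count query (hypothetical helper)."""
    ordered = sorted(values)

    def count(a: int, b: int) -> int:
        return bisect_right(ordered, b) - bisect_left(ordered, a)

    return count


count = make_counter([1, 2, 4, 8])
print(missing_ranges(count, 1, 8))
# [(3, 3), (5, 7)]
```

Subtracting 1 from the start of each run and adding 1 to its end gives the surrounding values (2, 4) and (4, 8). Each recursion level costs one count per unresolved range, so the number of round trips grows with the number of gaps rather than with the number of documents.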

system · October 9, 2017, 11:17pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
