We have an Elasticsearch cluster that serves documents similar to the following:
{
  "identifier" : "A",
  "sequenceNumber" : 1
}
{
  "identifier" : "A",
  "sequenceNumber" : 2
}
{
  "identifier" : "A",
  "sequenceNumber" : 4
}
{
  "identifier" : "A",
  "sequenceNumber" : 8
}
What we would like to do is create a service that, given an identifier, returns any gaps in its sequence numbers. For instance, for identifier A the service would return something similar to the following:
"gaps" : [
  {
    "lowerBound" : 2,
    "upperBound" : 4
  },
  {
    "lowerBound" : 4,
    "upperBound" : 8
  }
]
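To pin down what counts as a gap here, the following is a minimal Python sketch of the expected output, assuming all sequence numbers for one identifier have already been fetched into memory. The function name `find_gaps` is hypothetical; only the `lowerBound` and `upperBound` keys come from the example above.

```python
from typing import Dict, Iterable, List


def find_gaps(sequence_numbers: Iterable[int]) -> List[Dict[str, int]]:
    """Return one entry per pair of neighbouring numbers that are not consecutive."""
    ordered = sorted(set(sequence_numbers))
    gaps = []
    # Walk over neighbouring pairs; a difference above 1 means numbers are missing.
    for lower, upper in zip(ordered, ordered[1:]):
        if upper - lower > 1:
            gaps.append({"lowerBound": lower, "upperBound": upper})
    return gaps


print(find_gaps([1, 2, 4, 8]))
```

This is only practical when the number of documents per identifier is small enough to pull back in one go, which is why a server-side approach is what we are really after.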
My original idea was to get the maximum sequenceNumber for a particular identifier and then use range aggregation queries to see which sections have gaps. For the dataset above the maximum is 8, so the first round of range aggregations would look similar to the following:
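{
  "aggs" : {
    "identifier" : {
      "range" : {
        "field" : "sequenceNumber",
        "keyed" : true,
        "ranges" : [
          { "to" : 5 },
          { "from" : 5 }
        ]
      }
    }
  }
}
The response would then look something like this (the "to" bound is exclusive, so the first bucket covers sequence numbers 1 through 4):
{
  "aggregations" : {
    "identifier" : {
      "buckets" : {
        "*-5.0" : {
          "to" : 5.0,
          "doc_count" : 3
        },
        "5.0-*" : {
          "from" : 5.0,
          "doc_count" : 1
        }
      }
    }
  }
}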
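For reference, a request body like this could be built programmatically. The sketch below is plain Python that only constructs the dictionary. The helper name `build_range_request` is hypothetical, and the `term` filter assumes that `identifier` is mapped as a non-analyzed (keyword) field, which may not match our actual mapping. Because a range aggregation treats `from` as inclusive and `to` as exclusive, neighbouring buckets share a boundary value.

```python
from typing import Any, Dict, List


def build_range_request(identifier: str, split_points: List[int]) -> Dict[str, Any]:
    """Build a search body with one range bucket on each side of every split point."""
    if not split_points:
        raise ValueError("at least one split point is required")
    points = sorted(split_points)
    # First bucket is open at the bottom, last bucket is open at the top.
    ranges: List[Dict[str, int]] = [{"to": points[0]}]
    for lower, upper in zip(points, points[1:]):
        ranges.append({"from": lower, "to": upper})
    ranges.append({"from": points[-1]})
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        # Assumption: identifier is an exact-match (keyword) field.
        "query": {"term": {"identifier": identifier}},
        "aggs": {
            "identifier": {
                "range": {
                    "field": "sequenceNumber",
                    "keyed": True,
                    "ranges": ranges,
                }
            }
        },
    }


body = build_range_request("A", [5])
```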
Recursively splitting each range that returned fewer documents than expected (a range spanning n consecutive sequence numbers should contain n documents) would eventually reveal the gaps. My question: is there an easier way to do this with Elasticsearch?
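To make the recursive idea concrete, here is a rough Python sketch of the bisection. It assumes sequence numbers are unique per identifier and takes a hypothetical `count(lo, hi)` callable that returns how many documents have a sequenceNumber in the inclusive range. In practice that callable would issue a count or range aggregation query against the cluster; here `make_counter` is an in-memory stand-in so the logic can be run on its own.

```python
from bisect import bisect_left, bisect_right
from typing import Callable, Iterable, List, Tuple


def make_counter(values: Iterable[int]) -> Callable[[int, int], int]:
    """In-memory stand-in for a count query over an inclusive range."""
    ordered = sorted(values)

    def count(lo: int, hi: int) -> int:
        return bisect_right(ordered, hi) - bisect_left(ordered, lo)

    return count


def missing_ranges(count: Callable[[int, int], int], lo: int, hi: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) runs of sequence numbers absent from [lo, hi]."""
    if lo > hi:
        return []
    expected = hi - lo + 1
    found = count(lo, hi)
    if found >= expected:  # range is full, nothing missing here
        return []
    if found == 0:  # range is entirely empty, no need to split further
        return [(lo, hi)]
    mid = (lo + hi) // 2
    left = missing_ranges(count, lo, mid)
    right = missing_ranges(count, mid + 1, hi)
    # Join runs that touch across the split point.
    if left and right and left[-1][1] + 1 == right[0][0]:
        return left[:-1] + [(left[-1][0], right[0][1])] + right[1:]
    return left + right


print(missing_ranges(make_counter([1, 2, 4, 8]), 1, 8))  # [(3, 3), (5, 7)]
```

Each missing run (start, end) corresponds to a gap with lowerBound start - 1 and upperBound end + 1, so the output above matches the gaps 2 to 4 and 4 to 8. The number of queries grows with the number of gaps times the logarithm of the range, which is the cost we would like to avoid.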