I have an index that is append only. Every change gets saved as a new document, together with a date property. In order to get the current active data I need to group all entities by their ID and get the one with the latest date. It should be possible to add a filter for that date. I have been playing around with aggregations but I got stuck with the scroll API.
Data example (note: id's are guids, dates are unix timestamps, using other values here to make it clearer)
{"id": "<entityId_date>", "entityId", 1, "date": "2020-05-10", "value": { <JSON document> } },
{"id": "<entityId_date>", "entityId", 1, "date": "2020-06-10", "value": { <JSON document> } },
{"id": "<entityId_date>", "entityId", 2, "date": "2020-04-10", "value": { <JSON document> } },
{"id": "<entityId_date>", "entityId", 2, "date": "2020-08-10", "value": { <JSON document> } },
{"id": "<entityId_date>", "entityId", 3, "date": "2020-07-10", "value": { <JSON document> } }
Required results:
{"id": "<entityId_date>", "entityId", 1, "date": "2020-06-10", "value": { <JSON document> } },
{"id": "<entityId_date>", "entityId", 2, "date": "2020-08-10", "value": { <JSON document> } },
{"id": "<entityId_date>", "entityId", 3, "date": "2020-07-10", "value": { <JSON document> } }
Tried the following query:
{
"size": 0,
"aggs": {
"id_aggregate": {
"terms": {
"field": "entityId"
},
"aggs": {
"date_aggregate": {
"top_hits": {
"sort": [{
"date.ticks": {
"order": "desc"
}
}],
"size": 1
}
}
}
}
}
}
The result I get back is correct, but I only get the first X results back. I can set the size of the id_aggregate
, but there could be thousands of documents.
With a normal search query, I'd be able to use the Scroll API, but this is not possible with aggregations.
I have looked into using partitions, but that doesn't help me either, because sometimes there will be very few documents and sometimes there will be loads. (This is kind of a base query, all other queries we execute will have additional filters specified).
How can I build a query that fetches documents by their max date and is scrollable?