Suppose we have an index with records for a few devices. Each record contains a position, a timestamp, and a deviceId:
[
  {
    deviceId: 1,
    geoPoint: { lat: 1, lon: 2 },
    timestamp: '2018-01-01 12:00:00'
  },
  {
    deviceId: 1,
    geoPoint: { lat: 3, lon: 4 },
    timestamp: '2018-01-01 12:01:00'
  },
  {
    deviceId: 1,
    geoPoint: { lat: 5, lon: 6 },
    timestamp: '2018-01-01 12:02:00'
  },
  {
    deviceId: 2,
    geoPoint: { lat: 1, lon: 2 },
    timestamp: '2018-01-01 12:01:00'
  },
  {
    deviceId: 2,
    geoPoint: { lat: 3, lon: 4 },
    timestamp: '2018-01-01 12:02:00'
  },
  {
    deviceId: 2,
    geoPoint: { lat: 5, lon: 6 },
    timestamp: '2018-01-01 12:03:00'
  }
];
The mapping is quite simple:
deviceId: { type: 'keyword' },
geoPoint: { type: 'geo_point' },
timestamp: { type: 'date', format: DateUtil.ES_FORMAT },
I would like to fetch clusters based on geohash_grid for the last position of each distinct device. This seems simple, but the query does not return correct results; it looks like the geohash_grid aggregation does not take the sub-aggregations into account.
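To make the goal concrete, here is a minimal client-side sketch in plain JavaScript (not an Elasticsearch query) of what "last position per distinct device" means for the sample records above. The helper name `lastPositions` is hypothetical and only used for illustration; the clustering should then be applied to the points it returns.

```javascript
// Sample records, same values as listed in the question.
const records = [
  { deviceId: 1, geoPoint: { lat: 1, lon: 2 }, timestamp: '2018-01-01 12:00:00' },
  { deviceId: 1, geoPoint: { lat: 3, lon: 4 }, timestamp: '2018-01-01 12:01:00' },
  { deviceId: 1, geoPoint: { lat: 5, lon: 6 }, timestamp: '2018-01-01 12:02:00' },
  { deviceId: 2, geoPoint: { lat: 1, lon: 2 }, timestamp: '2018-01-01 12:01:00' },
  { deviceId: 2, geoPoint: { lat: 3, lon: 4 }, timestamp: '2018-01-01 12:02:00' },
  { deviceId: 2, geoPoint: { lat: 5, lon: 6 }, timestamp: '2018-01-01 12:03:00' }
];

// Hypothetical helper: keep only the newest record for each deviceId.
// Timestamps in 'YYYY-MM-DD HH:mm:ss' form compare correctly as plain strings.
function lastPositions(items) {
  const latest = new Map();
  for (const item of items) {
    const seen = latest.get(item.deviceId);
    if (!seen || item.timestamp > seen.timestamp) {
      latest.set(item.deviceId, item);
    }
  }
  return [...latest.values()];
}

const last = lastPositions(records);
console.log(last);
```

For the sample data this leaves one record per device: device 1 at 12:02:00 and device 2 at 12:03:00, both at lat 5, lon 6. Only these points should end up in the geohash buckets.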
Query:
{
  size: 0,
  aggs: {
    clustering: {
      geohash_grid: {
        field: 'geoPoint',
        precision: 2
      },
      aggs: {
        last_device_position: {
          terms: {
            field: 'deviceId',
            size: 1,
            order: {
              last_one: 'desc'
            }
          },
          aggs: {
            last_one: {
              max: {
                field: 'timestamp'
              }
            }
          }
        }
      }
    }
  }
};
Results:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "clustering": {
      "buckets": [
        {
          "key": "u2",
          "doc_count": 4,
          "last_device_position": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 2,
            "buckets": [
              {
                "key": "1",
                "doc_count": 2,
                "last_one": {
                  "value": 1514764860000,
                  "value_as_string": "2018-01-01 00:01:00"
                }
              }
            ]
          }
        },
        {
          "key": "t5",
          "doc_count": 2,
          "last_device_position": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 1,
            "buckets": [
              {
                "key": "1",
                "doc_count": 1,
                "last_one": {
                  "value": 1514764920000,
                  "value_as_string": "2018-01-01 00:02:00"
                }
              }
            ]
          }
        }
      ]
    }
  }
}
These results are incorrect: only records with timestamp 2018-01-01 00:02:00 should be taken into account.
I also tried using an aggregation path as the field:
geohash_grid: {
  field: 'last_device_position > geoPoint',
  precision: 2
},
or
geohash_grid: {
  field: 'last_one > geoPoint',
  precision: 2
}
But neither returned any results.
What is the right way to solve this issue?