Hi everyone,
I'm quite new to Elastic and not yet very familiar with its query DSL. I have an Elastic index with documents containing vehicles' positions at various times. Due to bad reception, some documents don't contain a location field.
Here's the mapping:
PUT /agg_test
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "_doc": {
      "properties": {
        "location": {
          "type": "geo_point"
        },
        "time": {
          "type": "date"
        },
        "vehicle": {
          "type": "keyword"
        }
      }
    }
  }
}
and the data:
POST /agg_test/_doc/_bulk?refresh
{"index":{"_id":1}}
{"location": "52.374081,4.912350", "time": "2018-01-01T13:00:00Z", "vehicle": "vcl_1"}
{"index":{"_id":2}}
{"name": "Museum Het Rembrandthuis","time": "2018-01-01T14:00:00Z", "vehicle": "vcl_1"}
{"index":{"_id":3}}
{"location": "52.371667,4.914722", "time": "2018-01-01T13:00:00Z", "vehicle": "vcl_2"}
{"index":{"_id":4}}
{"location": "51.222900,4.405200", "time": "2018-01-01T14:00:00Z", "vehicle": "vcl_2"}
{"index":{"_id":5}}
{"location": "48.861111,2.336389", "time": "2018-01-01T13:00:00Z", "vehicle": "vcl_3"}
{"index":{"_id":6}}
{"time": "2018-01-01T14:00:00Z", "vehicle": "vcl_3"}
I'm interested in finding the last known position of each vehicle (together with the time it was recorded) to use for visualisation. So I thought I could filter for the documents that have a location, then group by vehicle, get the maximum timestamp (or, better, the document with the maximum timestamp) and then ...
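To make the desired result concrete, here is a minimal sketch of that logic in plain Python (not Elastic), run against the sample documents above. The names DOCS, parse_time and last_known_positions are made up for illustration only, and the timestamp format is assumed to always match the one in the sample data.

```python
from datetime import datetime

# Sample documents copied from the bulk request above.
DOCS = [
    {"location": "52.374081,4.912350", "time": "2018-01-01T13:00:00Z", "vehicle": "vcl_1"},
    {"name": "Museum Het Rembrandthuis", "time": "2018-01-01T14:00:00Z", "vehicle": "vcl_1"},
    {"location": "52.371667,4.914722", "time": "2018-01-01T13:00:00Z", "vehicle": "vcl_2"},
    {"location": "51.222900,4.405200", "time": "2018-01-01T14:00:00Z", "vehicle": "vcl_2"},
    {"location": "48.861111,2.336389", "time": "2018-01-01T13:00:00Z", "vehicle": "vcl_3"},
    {"time": "2018-01-01T14:00:00Z", "vehicle": "vcl_3"},
]


def parse_time(value):
    """Parse a timestamp in the format used by the sample data (assumption)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def last_known_positions(docs):
    """Return, per vehicle, the latest document that has a location."""
    latest = {}
    for doc in docs:
        # Step 1: keep only documents with a location.
        if "location" not in doc:
            continue
        # Steps 2 and 3: group by vehicle and keep the newest document.
        current = latest.get(doc["vehicle"])
        if current is None or parse_time(doc["time"]) > parse_time(current["time"]):
            latest[doc["vehicle"]] = doc
    return latest


if __name__ == "__main__":
    for vehicle, doc in sorted(last_known_positions(DOCS).items()):
        print(vehicle, doc["time"], doc["location"])
```

For the sample data this gives the 13:00 position for vcl_1 and vcl_3 (their 14:00 documents have no location) and the 14:00 position for vcl_2. This is the result I would like to get from a single query.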
What I currently have is:
GET /agg_test/_search
{
  "query": {
    "exists": { "field": "location" }
  },
  "aggs": {
    "last_position": {
      "terms": {
        "field": "vehicle"
      },
      "aggs": {
        "max_time": {
          "max": {
            "field": "time"
          }
        }
      }
    }
  },
  "size": 0
}
As you can see, I'm stuck at this point: I couldn't find a bucket aggregation that builds a bucket from the documents for which some field is maximal. Alternatively, I'd need a way to feed the output of this query (vehicle together with timestamp) into a second query that fetches the locations. Of course I don't want to do that manually; I'd like a single query that does all of it. I also looked for a pipeline aggregation I could use for this task, without success.
As far as I know, none of the things I described above can be written as a query in Elastic.
What is the best practice in Elastic for queries like the one I want to make? Thank you!