Elasticsearch aggregation on object over time series


(bstsnail) #1

I have the following documents in Elasticsearch:
{"timestamp": 1457018658303, "location": {"lat":1, "lon":1}}
{"timestamp": 1457018718303, "location": {"lat":2, "lon":2}}
{"timestamp": 1457018778303, "location": {"lat":3, "lon":3}}
{"timestamp": 1457018838303, "location": {"lat":4, "lon":4}}
{"timestamp": 1457018898303, "location": {"lat":5, "lon":5}}
{"timestamp": 1457018958303, "location": {"lat":6, "lon":6}}

and the mapping is:
{
  "mappings": {
    "test": {
      "_all": { "enabled": false },
      "_source": { "enabled": false },
      "properties": {
        "timestamp": { "type": "date", "format": "epoch_millis", "doc_values": true },
        "location": {
          "properties": {
            "lat": {"type": "double", "index": "no", "doc_values": true},
            "lon": {"type": "double", "index": "no", "doc_values": true}
          }
        }
      }
    }
  }
}
Assume that the documents are one minute apart. I want to run a date_histogram on timestamp with an interval of 5 minutes and pick the first location record in each 5-minute bucket, so the aggregation result should look like this:
{
  "aggregations": {
    "buckets": [
      { "timestamp":1457018658303, "location.lat":1, "location.lon":1 },
      { "timestamp":1457018898303, "location.lat":5, "location.lon":5 }
    ]
  }
}

Can I aggregate like this?
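
To make the requested selection concrete, here is a minimal local sketch in Python that groups the sample documents into 5-minute buckets and keeps the earliest document in each. It assumes buckets are aligned to multiples of the interval since the epoch, which is how date_histogram rounds keys when no offset is set. The names `SAMPLE_DOCS` and `first_per_bucket` are hypothetical and only serve the illustration; nothing here talks to Elasticsearch.

```python
INTERVAL_MS = 5 * 60 * 1000  # 5 minutes in milliseconds

# Sample documents copied from the question.
SAMPLE_DOCS = [
    {"timestamp": 1457018658303, "location": {"lat": 1, "lon": 1}},
    {"timestamp": 1457018718303, "location": {"lat": 2, "lon": 2}},
    {"timestamp": 1457018778303, "location": {"lat": 3, "lon": 3}},
    {"timestamp": 1457018838303, "location": {"lat": 4, "lon": 4}},
    {"timestamp": 1457018898303, "location": {"lat": 5, "lon": 5}},
    {"timestamp": 1457018958303, "location": {"lat": 6, "lon": 6}},
]


def first_per_bucket(docs, interval_ms=INTERVAL_MS):
    """Return the earliest document of each epoch-aligned bucket."""
    buckets = {}
    for doc in sorted(docs, key=lambda d: d["timestamp"]):
        # Round the timestamp down to the start of its bucket.
        key = doc["timestamp"] - doc["timestamp"] % interval_ms
        # setdefault keeps the first (earliest) document seen per bucket.
        buckets.setdefault(key, doc)
    return [
        {
            "key": key,
            "timestamp": doc["timestamp"],
            "location.lat": doc["location"]["lat"],
            "location.lon": doc["location"]["lon"],
        }
        for key, doc in sorted(buckets.items())
    ]


if __name__ == "__main__":
    for bucket in first_per_bucket(SAMPLE_DOCS):
        print(bucket)
```

Under this assumption the bucket keys fall on round 5-minute boundaries rather than on the first timestamp, so which document comes first in a bucket depends on where those boundaries lie.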


(David Pilato) #2

Try with a date_histogram agg and add within it a top_hits agg with size=1.

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html
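
As a rough sketch of that suggestion, the request body could be built like this in Python. The aggregation names `result` and `first` and the function name `build_first_hit_query` are hypothetical. The ascending sort on the timestamp is an added assumption so that the single hit is the earliest document in each bucket. The `interval` parameter is the one used by Elasticsearch versions of this era; newer releases use `fixed_interval` or `calendar_interval` instead.

```python
import json


def build_first_hit_query(date_field="timestamp", interval="5m"):
    """Build a search body: date_histogram with a top_hits(size=1) sub-aggregation."""
    return {
        "size": 0,  # only the aggregation buckets are of interest, not the search hits
        "aggs": {
            "result": {
                "date_histogram": {"field": date_field, "interval": interval},
                "aggs": {
                    "first": {
                        "top_hits": {
                            "size": 1,
                            # Assumption: sort ascending so the hit is the first record.
                            "sort": [{date_field: {"order": "asc"}}],
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of a _search request.
    print(json.dumps(build_first_hit_query(), indent=2))
```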


(bstsnail) #3

Thanks dadoonet,
I have already tried that,
but because I have disabled the _source field, the result looks like this:
"aggregations": {
  "result": {
    "buckets": [
      {
        "key_as_string": "2016-02-27T15:20:00.000Z",
        "key": 1456586400000,
        "doc_count": 4,
        "hits": {
          "hits": {
            "total": 4,
            "max_score": 1,
            "hits": [
              { "_index": "geoindex13", "_type": "geotype13", "_id": "998", "_score": 1 }
            ]
          }
        }
      },
      ...
I can only get the index and the document ID; I can't get location.lat and location.lon.


(David Pilato) #4

And did you store the lat and lon fields?

PS: you should disable source only if you have a very good reason for doing it. For example, you won't be able to use the new reindex API without the source.


(bstsnail) #5

Yes, I did.
The sample data looks like this:
{"timestamp": 1457018658303, "location": {"lat":1, "lon":1}}
{"timestamp": 1457018718303, "location": {"lat":2, "lon":2}}
{"timestamp": 1457018778303, "location": {"lat":3, "lon":3}}
{"timestamp": 1457018838303, "location": {"lat":4, "lon":4}}
{"timestamp": 1457018898303, "location": {"lat":5, "lon":5}}
{"timestamp": 1457018958303, "location": {"lat":6, "lon":6}}
I disabled _source because I am storing time series data that will never be reindexed; I only want to aggregate over it.


(David Pilato) #6

Maybe script_fields can help?

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html
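
One way this might look, sketched in Python: script_fields placed inside a top_hits sub-aggregation, with each script reading the value from doc values rather than from _source. The function name `build_script_fields_query`, the aggregation names `result` and `first`, and the output names `lat` and `lon` are hypothetical. The ascending sort is an assumption to select the earliest document per bucket. Whether inline scripts are allowed depends on the cluster's scripting settings, and the `interval` parameter is replaced by `fixed_interval` or `calendar_interval` in newer releases.

```python
import json


def build_script_fields_query(fields, date_field="timestamp", interval="5m"):
    """Build a search body whose top_hits returns doc-values-backed script fields.

    `fields` maps an output name to the full path of a field with doc_values enabled.
    """
    script_fields = {
        # Each script reads the field through the doc-values accessor.
        name: {"script": "doc['%s'].value" % path}
        for name, path in fields.items()
    }
    return {
        "size": 0,
        "aggs": {
            "result": {
                "date_histogram": {"field": date_field, "interval": interval},
                "aggs": {
                    "first": {
                        "top_hits": {
                            "size": 1,
                            # Assumption: earliest document in each bucket.
                            "sort": [{date_field: {"order": "asc"}}],
                            "script_fields": script_fields,
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_script_fields_query({"lat": "location.lat", "lon": "location.lon"})
    print(json.dumps(body, indent=2))
```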

