Return the newest document for each distinct field value

Hi,

I have an index whose documents all have the same structure. For simplicity, let's say each document has three fields:

  1. timestamp
  2. recordLocator
  3. Value

Several documents may share the same recordLocator, but each will have a different timestamp.

I would like to return the newest document (sorting by timestamp) for each distinct recordLocator. How can this be accomplished?

Carlos

Hi,

This will do the trick:

aggs <- '{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "grouped_by_name": {
      "terms": {
        "field": "recordLocator",
        "size": 0
      },
      "aggs": {
        "top_tag_hits": {
          "top_hits": {
            "sort": [
              {
                "timestamp": {
                  "order": "desc"
                }
              }
            ],
            "size": 1
          }
        }
      }
    }
  }
}'

It does seem a bit slow, though: the query takes about 15 seconds, and my index "only" has about 200 million records.
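
In case it helps with reading the result: each bucket of the terms aggregation carries its own top_hits block, so the newest document for a recordLocator is the first hit inside that bucket. Below is a minimal Python sketch of pulling those out of the response. The function name newest_per_locator and the SAMPLE_RESPONSE data are made up for illustration (the sample is a hand-written, trimmed-down fragment, not real output from my index); only the aggregation names grouped_by_name and top_tag_hits come from the query above.

```python
import json


def newest_per_locator(response):
    """Map each recordLocator to the _source of its newest document.

    Assumes `response` is the parsed JSON body returned for the
    aggregation above (names grouped_by_name / top_tag_hits).
    """
    buckets = response["aggregations"]["grouped_by_name"]["buckets"]
    newest = {}
    for bucket in buckets:
        # top_hits was requested with size 1 and sorted by timestamp desc,
        # so the first hit (if any) is the newest document in the bucket.
        hits = bucket["top_tag_hits"]["hits"]["hits"]
        if hits:
            newest[bucket["key"]] = hits[0]["_source"]
    return newest


# Hypothetical, hand-written response fragment for illustration only.
SAMPLE_RESPONSE = {
    "aggregations": {
        "grouped_by_name": {
            "buckets": [
                {
                    "key": "ABC123",
                    "doc_count": 3,
                    "top_tag_hits": {"hits": {"hits": [
                        {"_source": {"timestamp": "2015-03-02T10:00:00Z",
                                     "recordLocator": "ABC123",
                                     "Value": "b"}}
                    ]}},
                },
                {
                    "key": "XYZ789",
                    "doc_count": 1,
                    "top_tag_hits": {"hits": {"hits": [
                        {"_source": {"timestamp": "2015-01-15T08:30:00Z",
                                     "recordLocator": "XYZ789",
                                     "Value": "a"}}
                    ]}},
                },
            ]
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(newest_per_locator(SAMPLE_RESPONSE), indent=2))
```

The top-level "hits" section of the response is not used here; everything of interest sits under "aggregations".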