Elasticsearch query to return the most recent of 'each document' based on a condition

I am trying to retrieve the most recent version of each document in my dataset, but only for documents that have never been archived (archived: false). If any version of a document has archived set to true, that document should not appear in my result at all. Versions of the same document share a base_id.

An example of my dataset:

{
    name: "soccer game",
    base_id: 1,
    hours_remaining: 10,
    updatedDate: "2019-03-10",
    archived: false
}

{
    name: "basketball game",
    base_id: 2,
    hours_remaining: 20,
    updatedDate: "2019-03-10",
    archived: false
}

{
    name: "soccer game",
    base_id: 1,
    hours_remaining: 5,
    updatedDate: "2019-03-14",
    archived: true
}

The expected result is:

{
    name: "basketball game",
    base_id: 2,
    hours_remaining: 20,
    updatedDate: "2019-03-10",
    archived: false
}
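To make the intended behaviour concrete, here is a minimal plain-Python sketch of the rule described above, run against the three sample documents. It is only a reference for the expected semantics, not an Elasticsearch query; the function name `latest_unarchived` is made up for illustration.

```python
# Sample documents copied from the question. Dates are ISO strings,
# so plain string comparison orders them chronologically.
docs = [
    {"name": "soccer game", "base_id": 1, "hours_remaining": 10,
     "updatedDate": "2019-03-10", "archived": False},
    {"name": "basketball game", "base_id": 2, "hours_remaining": 20,
     "updatedDate": "2019-03-10", "archived": False},
    {"name": "soccer game", "base_id": 1, "hours_remaining": 5,
     "updatedDate": "2019-03-14", "archived": True},
]


def latest_unarchived(documents):
    """Return the newest version per base_id, skipping every base_id
    that has at least one archived version (hypothetical helper)."""
    groups = {}
    for doc in documents:
        groups.setdefault(doc["base_id"], []).append(doc)

    result = []
    for versions in groups.values():
        # One archived version disqualifies the whole group.
        if any(v["archived"] for v in versions):
            continue
        result.append(max(versions, key=lambda v: v["updatedDate"]))
    return result


expected = latest_unarchived(docs)
print(expected)
```

The key point is that the archived check is made per group (per base_id), not per individual version.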

After writing several queries, I haven't been able to achieve my goal. This is one of my attempts.

{
  "size": 10,
  "query": {
    "bool": {
      "must": [
        {"query_string": {"query": "*", "fields": ["name.keyword"]}},
        {"term": {"archived": false}}
      ]
    }
  },
  "collapse": {
    "field": "base_id",
    "inner_hits": {
      "name": "most_recent",
      "size": 1,
      "sort": [{"updatedDate": "desc"}]
    }
  }
}
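A likely reason this attempt still returns the soccer game: the query clause is evaluated first, and collapse only groups the hits that already matched. The archived version is therefore dropped before grouping, and the older, non-archived version remains. The sketch below imitates that order of operations in plain Python; it is a simplification (real collapsing picks the top-level hit by the main sort, with the sorted version in inner_hits), and `filter_then_collapse` is a made-up name.

```python
docs = [
    {"name": "soccer game", "base_id": 1, "hours_remaining": 10,
     "updatedDate": "2019-03-10", "archived": False},
    {"name": "basketball game", "base_id": 2, "hours_remaining": 20,
     "updatedDate": "2019-03-10", "archived": False},
    {"name": "soccer game", "base_id": 1, "hours_remaining": 5,
     "updatedDate": "2019-03-14", "archived": True},
]


def filter_then_collapse(documents):
    """Imitate the attempt: filter on archived first, then group by base_id."""
    # Step 1: the term filter keeps only versions with archived == False.
    matching = [d for d in documents if not d["archived"]]

    # Step 2: collapse on base_id, keeping the newest remaining version
    # (this stands in for the inner_hits sort on updatedDate desc).
    collapsed = {}
    for d in matching:
        current = collapsed.get(d["base_id"])
        if current is None or d["updatedDate"] > current["updatedDate"]:
            collapsed[d["base_id"]] = d
    return list(collapsed.values())


attempt_result = filter_then_collapse(docs)
print(sorted(d["name"] for d in attempt_result))
```

Because the archived version never reaches the grouping step, nothing marks base_id 1 as archived, so its older version is returned alongside the basketball game.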

What am I doing wrong?
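One direction that might work, offered as an untested sketch and not a confirmed solution: drop the archived filter from the query and decide per base_id inside a terms aggregation, using a filter sub-aggregation to count archived versions, a bucket_selector to keep only groups where that count is zero, and top_hits to fetch the newest version. The aggregation names (`by_base_id`, `archived_versions`, `latest`, `only_unarchived`) are made up here, and the field names are assumed to match the mapping shown in the question.

```python
import json

# Request body built as a Python dict so it can be printed or sent with any
# HTTP or Elasticsearch client. All aggregation names are hypothetical.
body = {
    "size": 0,  # only the aggregation buckets are of interest
    "aggs": {
        "by_base_id": {
            "terms": {"field": "base_id", "size": 10},
            "aggs": {
                # Counts versions in this group that are archived.
                "archived_versions": {
                    "filter": {"term": {"archived": True}}
                },
                # Newest version within the group.
                "latest": {
                    "top_hits": {
                        "size": 1,
                        "sort": [{"updatedDate": {"order": "desc"}}],
                    }
                },
                # Keeps the bucket only when no version is archived.
                "only_unarchived": {
                    "bucket_selector": {
                        "buckets_path": {
                            "archivedCount": "archived_versions>_count"
                        },
                        "script": "params.archivedCount == 0",
                    }
                },
            },
        }
    },
}

print(json.dumps(body, indent=2))
```

Note that bucket_selector prunes buckets after the terms aggregation has already chosen them, so the terms size may need to be larger than the number of results wanted.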
