In elasticsearch finding documents which meet a specific criteria for the latest for each group

I am trying to create a query in elasticsearch, which is able to retrieve the documents for each group, which is the latest document within each group and meet a specific criteria. But I have not been able to solve this one.

Say following documents are indexed in myindex in elasticsearch:

POST /myindex/_bulk
{ "index":{} }
{ "objid": 1, "ident":"group1","version":1, "chdate": 1, "field1" : 1}
{ "index":{} }
{ "objid": 2, "ident":"group1","version":2, "chdate": 2, "field1" : 0}
{ "index":{} }
{ "objid": 3, "ident":"group1","version":2, "chdate": 3, "field1" : 1}
{ "index":{} }
{ "objid": 4, "ident":"group1","version":2, "chdate": 4, "field1" : 0}
{ "index":{} }
{ "objid": 5, "ident":"group1","version":3, "chdate": 1, "field1" : 0}

I would like to find all documents, which has field1 set to x if the document with the highest chdate, for each ident and version, which has field1 set to x.

In a case where x is 0 then the documents, which has objid 4 and 5 should be returned In a case where x is 1 then the documents, which has objid 1 should be returned

{
  "size": 0,
  "aggs": {
    "by_ident": {
      "terms": {
        "field": "ident.keyword",
        "size": 10
      },
      "aggs": {
        "by_version": {
          "terms": {
            "field": "version",
            "size": 10000
          },
          "aggs": {
            "by_latest": {
              "top_hits": {
                "sort": [{
                  "chdate": {
                    "order": "desc"
                  }
                }], 
                "size": 1
              }
            }
          }
        }
      }
    }
  }
}

But I am unsure how I can add the criteria that field1 should be equal to x

Thanks in advance

Hi @m.a.tanaka

You can add the term query for filter documents with field1 equals 1 after aggs like this:

{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "field1": 1
          }
        }
      ]
    }
  }, 
  "aggs": {
    "by_ident": {
      "terms": {
        "field": "ident.keyword",
        "size": 10
      },
      "aggs": {
        "by_version": {
          "terms": {
            "field": "version",
            "size": 10000
          },
          "aggs": {
            "by_latest": {
              "top_hits": {
                "sort": [{
                  "chdate": {
                    "order": "desc"
                  }
                }], 
                "size": 1
              }
            }
          }
        }
      }
    }
  }
}

You can use a query without aggs. This example use Sort.

{
  "size": 10,
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "field1": 0
          }
        }
      ]
    }
  },
  "sort": [
    {
      "chdate": {
        "order": "desc"
      }
    }
  ]
}

Unfortunately it will not work because it will pick up the documents which has objid 1 and 3, however it should not pick up the one that has objid 3, as for that version (i.e. version 2) it should look at the document with objid 4 and not pick it up as that document has field1 set to 0

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.