Filter index by max field value

Ok, this seems easy from a relational perspective, but I'm becoming unable to achieve it on ES:

I have an index where, among others, there is a numeric field called year. I want to filter my index data, retrieving only those documents which year is equal to the maximum value of year whitin the whole index.

In a relational database using SQL you'd use something more or less like this:

select * from index where year = (select max(year) from index);

On ES I know that I can get my maximum year value through an aggregation:

{
  "aggs": {
    "maxYear": {
      "max": {
        "field": "year"
      }
    }
  }

However, I'm being unable to use that maxYear calculated value on the same query. I've been trying to use different pipelines and aggregation combinations to no success.

Also I thought to store my maximum value on a runtime field, but now know really how to obtain that aggregated value on a runtime field.

Any insight on the matter will be much appreciated.

Hello @PMF

What if we sort the bucket with size 1 & only get records for that bucket?

{
  "size": 0,
  "aggs": {
    "year_buckets": {
      "terms": {
        "field": "year",
        "order": { "_key": "desc" },
        "size": 1
      },
      "aggs": {
        "top_docs": {
          "top_hits": {
            "size": 100
          }
        }
      }
    }
  }
}

Thanks!!

Hi @Tortoise , thank you for your response.

Indeed, your proposal seems to retrieve the expected data. The only problem here is that top_hits force to put a limit to the number of elements retrieved.

I need to obtain all possible results. There is any way I can return all results without a top_hits size?

Hello @PMF

You are correct. I believe then we need to use 2 step approach of finding the max year & using that to fetch all the records.

I see we can increase the docs as part of top_hits from 100 to greater value but that is risky & not recommended in Production which will impact the performance/cluster.

Thanks!!