Ok, this seems easy from a relational perspective, but I'm becoming unable to achieve it on ES:
I have an index where, among others, there is a numeric field called year. I want to filter my index data, retrieving only those documents which year is equal to the maximum value of year whitin the whole index.
In a relational database using SQL you'd use something more or less like this:
select * from index where year = (select max(year) from index);
On ES I know that I can get my maximum year value through an aggregation:
However, I'm being unable to use that maxYear calculated value on the same query. I've been trying to use different pipelines and aggregation combinations to no success.
Also I thought to store my maximum value on a runtime field, but now know really how to obtain that aggregated value on a runtime field.
Any insight on the matter will be much appreciated.
Indeed, your proposal seems to retrieve the expected data. The only problem here is that top_hits force to put a limit to the number of elements retrieved.
I need to obtain all possible results. There is any way I can return all results without a top_hits size?
You are correct. I believe then we need to use 2 step approach of finding the max year & using that to fetch all the records.
I see we can increase the docs as part of top_hits from 100 to greater value but that is risky & not recommended in Production which will impact the performance/cluster.
Hello @Tortoise, sorry for leting pass so much time without a response, but I've been abroad and hadn't so much time.
I'm not sure I'm getting what you refer as a 2 step approach. AFAIK, in elastic, is not possible to execute a query on the results of a previous one.
Do you mean to retrieve maximum year on a query and later, with a different query, filter data by the retrieved value? If that so, I'm not very sure this approach will work for my use case...
This is not the solution but i see there is one more way where we can fetch the records via Watcher Chain Input, this generally will be used for alerting (if that is the usecase you are looking for) where we want to pass dynamic value as input from output of previous query :
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.