Is there anyway I can using the results obtained from aggregations to filter out the final hits in the query?
I want to obtain a list of users who have more than 2 devices and the list of their devices in the database. The device count can be done using aggregations, however I'm having a hard time trying to figure out how to use that results to apply on the final hits.
I thought about using post_filter but it didn't seem to work.
The equivalence of SQL would be something like this:
WITH
device_count AS (
SELECT
user,
COUNT(device_id) nb_device
FROM table
GROUP BY user
HAVING COUNT(device_id) >= 2
)
SELECT
table.user,
table.device
FROM table
JOIN device_count ON device_count.user = table.user
Thanks for your support. I just have 2 more following questions regarding this:
When I set the size = 10000, an error TransportError: TransportError(503, 'search_phase_execution_exception') was raised. The maximum size I could use was 2000. How can I increase the size?
Is there a way to know how many aggregated results will be returned? How can I paginate if I have more than 10000 (or 2000 in my case) in size?
Searching Google hasn't given me any satisfying answers yet...
See Size section in the terms aggregation doc. The default value of search.max_buckets is 65,536. Take care that someone set the value to 2,000 with some intention.
As described in the doc, you can use composite aggregation to pagenate on the aggregation result.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.