I've also posted this question on StackOverflow (with a bounty) here. But here is the same question in the discussion board:
As far as I'm aware, there isn't a way to do something like the following in Elasticsearch:
SELECT * FROM myindex GROUP BY agg_field1, agg_field2, agg_field3 // aggregation ORDER BY order_field1, order_field2, order_field3 // sort LIMIT 1000, 5000 // paginate -- get page 6 of size 1000 records
Here are some related documents regarding this:
- Elasticsearch aggregation order by top hit score with partitions
Is there a way to do the above in Elasticsearch? The one limitation we have is we will never have more than 10M records, so we (hopefully) shouldn't run into memory errors. My thinking was to do it as follows:
- Do an aggregation query
- Get the number of results from it
- Split it into N segments based on the results and page size we want
- Rerun the query with the above segments
What would be the best way to accomplish this? In your answer/suggestion, could you please post some sample code relating to how the above SQL query could be done in ES?
As an update to this question, here is a public index to test with:
It has 10,000 records:
The query that I'm looking to do is as follows (in sql):
SELECT * FROM testindex GROUP BY store_url, status, title ORDER BY retailer_id DESC, status_errors ASC LIMIT 100 OFFSET 6000
In other words, I'm looking to sort an aggregated result (with multiple aggregations) and get an offset.