Get documents starting at a high from

NominaSumpta · March 3, 2022, 8:42am

Hi,

I have an index with roughly 1.500.000 (1.5 million) documents. I only want to get 1.000 results from it, but I want to start counting from the end. So, I want to retrieve documents 1.499.000 (1.500.000 - 1.000) through 1.500.000. I've set 'from' to 1499000 and 'size' to 1000, so from + size equals the total number of documents: 1.500.000. As expected, this causes:

elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'Result window is too large, from + size must be less than or equal to: [10000] but was [1474810]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.')

The scroll API reference (Scroll API, Elasticsearch Guide [7.17]) says:

We no longer recommend using the scroll API for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT).

Should I also be using PIT for retrieving just a few results, but starting at a high from?

spinscale · March 3, 2022, 9:50am

My gut feeling is that even though you could switch from a regular query to scroll search, PIT, or search_after, maybe the query itself could be improved. If you tell us more about the use case, that might help.

Could you change the sorting strategy or filtering to retrieve the required documents instead of paginating through them?

NominaSumpta · March 3, 2022, 10:05am

Thanks for your reply.

The use case is as follows: I have a sort order and a limit. When the sort order is ascending, I want to inverse the limit. So:

If I have 10.000 documents, the sort order set to ascending and the limit set to 1, I will get document 9.999
If I have 10.000 documents, the sort order set to descending and the limit set to 1, I will get document 1

In other words: when the sort order is ascending, from is set to document count minus limit and size is set to the limit.
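To make the arithmetic concrete, here is a minimal sketch of the from/size computation described above, checked against the default `index.max_result_window` of 10000 quoted in the error message. The helper names `ascending_tail_window` and `exceeds_result_window` are hypothetical and only illustrate why the request is rejected.

```python
from typing import Tuple

# Default value of index.max_result_window, as quoted in the error message.
DEFAULT_MAX_RESULT_WINDOW = 10_000


def ascending_tail_window(doc_count: int, limit: int) -> Tuple[int, int]:
    """Return (from, size) for the last `limit` documents in ascending order."""
    size = min(limit, doc_count)  # never ask for more documents than exist
    return doc_count - size, size


def exceeds_result_window(from_: int, size: int,
                          max_window: int = DEFAULT_MAX_RESULT_WINDOW) -> bool:
    """True when Elasticsearch would reject the request as too deep."""
    return from_ + size > max_window


from_, size = ascending_tail_window(1_500_000, 1_000)
print(from_, size, exceeds_result_window(from_, size))
```

With 1.500.000 documents and a limit of 1.000, from + size equals the full document count, far beyond the window.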

NominaSumpta · March 3, 2022, 11:37am

FWIW: when I refer to 'limit', I actually mean the Elasticsearch concept of 'size'.

casterQ · March 4, 2022, 7:37am

Your question is not very clear. Suppose you have 10000 documents:
if you want to get the last 100 docs (9900~10000) sorted ascending,
can you sort descending instead and take the top 100 (1~100)?

NominaSumpta · March 4, 2022, 9:17am

Yes, I could, but they'd be in the wrong order.

E.g. when I want documents 7 and 8 (in that order):

DESC: 8, 7, 6, 5
ASC: 5, 6, 7, 8

DESC will give me the documents I need, 8 and 7, but in the wrong order.

I could of course drop the from, sort DESC and reverse() the results in Python ...

casterQ · March 4, 2022, 9:27am

Oh, I see. I would probably fetch the docs and reverse them myself.

casterQ · March 4, 2022, 9:29am

Because from + size is expensive, and scroll and PIT are not appropriate for your needs...

NominaSumpta · March 4, 2022, 9:52am

Thanks, that's what I thought (and why I asked the question). I'll stick around for a bit to see if anyone else has any ideas.

NominaSumpta · March 5, 2022, 6:21pm

I could of course drop the from, sort DESC and reverse() the results in Python ...

I've solved it this way. Thanks for your replies, @casterQ and @spinscale!
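For anyone landing here later, a minimal sketch of that approach: sort descending, request only `size` hits with no `from`, then reverse the hits client-side. The thread does not show the actual code, so the sort field name `timestamp`, the `match_all` query, and the helper names are assumptions for illustration only.

```python
from typing import Any, Dict, List


def build_tail_query(sort_field: str, size: int) -> Dict[str, Any]:
    """Search body that fetches the last `size` documents by sorting descending."""
    return {
        "size": size,  # no "from", so the result window stays small
        "sort": [{sort_field: {"order": "desc"}}],
        "query": {"match_all": {}},  # assumption: replace with the real query
    }


def hits_in_ascending_order(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Reverse the descending hits so they read in ascending order."""
    return list(reversed(response["hits"]["hits"]))


body = build_tail_query("timestamp", 2)  # "timestamp" is a hypothetical field

# Against a real cluster this would be something like (7.x Python client):
#   response = es.search(index="my-index", body=body)
# Here a hand-written response stands in for the descending result.
fake_response = {"hits": {"hits": [{"_id": "8"}, {"_id": "7"}]}}
ordered_ids = [hit["_id"] for hit in hits_in_ascending_order(fake_response)]
print(ordered_ids)
```

Because only `size` hits are requested, from + size stays well under `index.max_result_window` regardless of how many documents the index holds.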

system · April 2, 2022, 6:21pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
