Index max_result_window

bevans88 · January 5, 2016, 9:57am

Hi,

After upgrading to ES 2.1 I've noticed that max_result_window now defaults to 10000 and that a search throws an exception if it is exceeded. I understand the reasons behind this and was wondering if there is any other way to do the following (without increasing the max_result_window setting):

Perform a search across a set of 100k documents, with the results being paged. This is done using the 'from' and 'size' query settings.
Click on any of the pages, with the results for that page being displayed. As it stands now this will throw an exception if the page exceeds the default result window.

The search needs to be ordered, so using scan/scroll seems to be out of the question.

Thanks,

Brent

dadoonet · January 5, 2016, 10:11am

You can use scroll without scan so results are ordered.
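A rough sketch of what the two scroll requests might look like, built as plain Python data rather than sent to a cluster. The index name `docs`, the sort field `created_at`, the page size and the `1m` keep-alive are all assumptions for illustration; the point is only that the first request carries a `sort` and no `search_type=scan`.

```python
import json


def initial_scroll_request(index, sort_field, page_size, keep_alive="1m"):
    """Build the first request of a sorted scroll (no search_type=scan)."""
    path = "/{}/_search?scroll={}".format(index, keep_alive)
    body = {
        "size": page_size,                  # hits returned per scroll batch
        "sort": [{sort_field: "asc"}],      # hypothetical sort field
        "query": {"match_all": {}},
    }
    return "POST", path, body


def next_scroll_request(scroll_id, keep_alive="1m"):
    """Build a follow-up request that fetches the next batch."""
    body = {"scroll": keep_alive, "scroll_id": scroll_id}
    return "POST", "/_search/scroll", body


# "docs" and "created_at" are made-up names; the scroll id comes from
# the previous response in a real session.
first = initial_scroll_request("docs", "created_at", 100)
follow_up = next_scroll_request("<scroll_id from previous response>")
for method, path, body in (first, follow_up):
    print(method, path, json.dumps(body, sort_keys=True))
```

Note that a scroll only moves forward, so this suits walking through all results in order rather than jumping to an arbitrary page.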

nik9000 · January 5, 2016, 12:52pm

As David says, you can scroll without scan. Other than that, you'd have to
raise the setting. Or you could prevent such deep scrolling in your
application. I've done that in the past.

I don't know of anything else you could do as it stands now.

I suspect it's technically possible to build an "after" style clause that'd
find you results whose scores are lower than some point. It wouldn't be
100% accurate because the data changes, but it's something.
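
To make the "after" idea concrete, here is a minimal in-memory sketch, not an Elasticsearch feature. It assumes each hit is a `(score, doc_id)` pair and adds a tie-break on `doc_id` (my assumption, so that hits with equal scores are not skipped); `page_after` is a hypothetical helper name.

```python
def page_after(docs, size, after=None):
    """Return the next `size` hits ranked after the hit `after`.

    docs  : list of (score, doc_id) pairs
    after : the (score, doc_id) of the last hit on the previous page,
            or None for the first page
    Ordering is score descending, then doc_id ascending as a tie-break.
    """
    def sort_key(hit):
        return (-hit[0], hit[1])

    ordered = sorted(docs, key=sort_key)
    if after is not None:
        after_key = sort_key(after)
        # Keep only hits that rank strictly after the previous page's last hit.
        ordered = [hit for hit in ordered if sort_key(hit) > after_key]
    return ordered[:size]


docs = [(3.0, "a"), (2.0, "b"), (2.0, "c"), (1.0, "d")]
first_page = page_after(docs, 2)
second_page = page_after(docs, 2, after=first_page[-1])
print(first_page)   # [(3.0, 'a'), (2.0, 'b')]
print(second_page)  # [(2.0, 'c'), (1.0, 'd')]
```

Each page costs the same no matter how deep it is, but you can only step to the next page from a known hit, and results can shift if documents change between requests.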

bevans88 · January 5, 2016, 12:57pm

Thanks for the advice :). I'll stick with increasing the size as I still need the ability to return pages of results out of linear order.

dynamicscope · March 3, 2016, 12:35pm

Could you explain the reason behind this? Would it be dangerous to increase this setting?
I was using this for paginating more than 150,000 docs.

dynamicscope · March 3, 2016, 12:40pm

Never mind, I found the reason.
Now, I am wondering if there is any effective way to paginate a large number of docs.
Would scan & scroll make a good pagination feature?

Deep Paging in Distributed Systems

To understand why deep paging is problematic, let’s imagine that we are searching within a single index with five primary shards. When we request the first page of results (results 1 to 10), each shard produces its own top 10 results and returns them to the coordinating node, which then sorts all 50 results in order to select the overall top 10.

Now imagine that we ask for page 1,001 (results 10,001 to 10,010). Everything works in the same way except that each shard has to produce its top 10,010 results. The coordinating node then sorts through all 50,050 results and discards 50,040 of them!

You can see that, in a distributed system, the cost of sorting results grows the deeper we page, because every shard has to produce from + size results and the coordinating node has to sort all of them. There is a good reason that web search engines don’t return more than 1,000 results for any query.
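
The arithmetic in the passage above can be checked with a few lines of Python. This is just the counting from the five-shard example, with `deep_paging_cost` as a hypothetical helper name; it does not model the real sort cost or memory use.

```python
def deep_paging_cost(num_shards, from_, size):
    """Count the results handled for one page of a from/size search.

    Returns (per_shard, gathered, discarded):
      per_shard : results each shard must produce (from + size)
      gathered  : results the coordinating node receives and sorts
      discarded : gathered results that are thrown away
    """
    per_shard = from_ + size
    gathered = num_shards * per_shard
    discarded = gathered - size
    return per_shard, gathered, discarded


# First page: results 1 to 10 on five shards.
print(deep_paging_cost(5, 0, 10))      # (10, 50, 40)
# Results 10,001 to 10,010 on the same five shards.
print(deep_paging_cost(5, 10000, 10))  # (10010, 50050, 50040)
```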

Ranjit_Shinde · April 12, 2016, 4:40am

Hi,
So with max_result_window defaulting to 10K, does that mean I cannot run a query like
url?from=10000&size=200 on my indices?
Do I need to change the setting manually to do that?
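
For reference, the check being asked about is that from + size must not exceed index.max_result_window. A small sketch of that check, plus the settings body one would send in a `PUT /<index>/_settings` request to raise the limit; the helper names and the 100000 value are only illustrative.

```python
DEFAULT_MAX_RESULT_WINDOW = 10000


def fits_result_window(from_, size, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """True if a from/size request stays within the result window."""
    return from_ + size <= max_result_window


def raise_window_settings(new_limit):
    """Settings body that raises index.max_result_window for an index."""
    return {"index": {"max_result_window": new_limit}}


print(fits_result_window(10000, 200))          # False: 10200 > 10000
print(fits_result_window(9800, 200))           # True: 10000 <= 10000
print(fits_result_window(10000, 200, 100000))  # True once the limit is raised
print(raise_window_settings(100000))
```

So from=10000&size=200 is rejected with the default setting, and it goes through only after the window is raised to at least 10200.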
