Result window is too large, from + size must be less than or equal to: [10000] but was [10050]

I am trying to paginate my data in Elasticsearch, but I am not able to fetch beyond 10,000 results.

I found that it is doable through the "top hits aggregation"

https://www.elastic.co/guide/en/elasticsearch/reference/6.2/search-aggregations-metrics-top-hits-aggregation.html

where I can set from + size greater than 10,000.

I just want to verify whether this approach to pagination is advisable.

Please suggest.

By default, offset + limit is capped at 10,000. The cap can be raised through the index.max_result_window index setting, but I would seriously advise against doing so.

When paginating in this manner, Elasticsearch has to do all of the following for each page: parse the query, build the search context, distribute the query to the applicable shards, collate the results, skip past $offset items, read out $limit items, and destroy the search context. The deeper we paginate, the more expensive each page is than the one before it.
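
To put rough numbers on that, here is a minimal sketch of how the work grows with the offset. It assumes each shard has to return its own top offset + limit hits to the coordinating node; the function name and the 5-shard, 50-per-page figures are made up for illustration.

```python
def docs_collected(offset: int, limit: int, shards: int) -> int:
    """Rough count of hits the coordinating node must collate for one page.

    Assumption: every shard returns its own top (offset + limit) hits,
    because any of them could end up inside the requested window.
    """
    return shards * (offset + limit)


if __name__ == "__main__":
    # Hypothetical index with 5 shards and 50 hits per page.
    for offset in (0, 1_000, 10_000, 100_000):
        print(offset, docs_collected(offset, limit=50, shards=5))
```

Under these assumptions the page size never changes, yet the number of hits that must be sorted and thrown away keeps growing with the offset.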

Oy.

The 10,000 limit is there for a reason.

Thankfully, Elasticsearch has a scroll API, which reuses search context and position from one request to the next. You should use it when you need to paginate deeply.
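
A minimal sketch of a scroll loop, in case it helps. It assumes a client object that exposes search, scroll and clear_scroll with the keyword arguments used by the 6.x/7.x Python client; the helper name scroll_all, the page size and the keep-alive value are my own choices, and newer client versions may want slightly different arguments.

```python
from typing import Any, Dict, Iterator


def scroll_all(client: Any, index: str, query: Dict[str, Any],
               page_size: int = 1000,
               keep_alive: str = "1m") -> Iterator[Dict[str, Any]]:
    """Yield every hit matching `query`, one scroll page at a time."""
    # The first request opens the search context and returns page one.
    response = client.search(index=index, scroll=keep_alive,
                             body={"size": page_size, "query": query})
    scroll_id = response.get("_scroll_id")
    try:
        while response["hits"]["hits"]:
            yield from response["hits"]["hits"]
            # Later requests reuse the context; only the scroll id is sent.
            response = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Free the search context now instead of waiting for the timeout.
        if scroll_id is not None:
            client.clear_scroll(scroll_id=scroll_id)
```

Usage would look something like `for hit in scroll_all(es, "my-index", {"match_all": {}}): ...`, where es is your client and "my-index" is a placeholder index name.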

To sum up:

  • educate your users. You normally don't have to click "next page" 1000 times to get the result you were looking for. Think about Google or Qwant. Do you often go beyond page 1?
  • add a way to reverse the sort order, so that the last page becomes the first page.
  • if your need is to extract all the data to do some other processing later, then as @yaauie said, the scroll API is the way to go. It has a big advantage: when you scroll, whatever happens on your index (new documents added, for example), you will get consistent results.
  • if you really need to do deep pagination, look at the search_after feature. It has been designed for that. Note that you can basically just do "next page" with it, but not "go to page 1476".
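
For the search_after case, a minimal sketch of how the request body for each page could be built. The helper name and the timestamp and doc_id sort fields are hypothetical; the one thing it relies on is that every hit in a sorted response carries a "sort" array, which is passed back unchanged. The sort should end with a field that is unique per document so that ties are broken consistently.

```python
from typing import Any, Dict, List, Optional


def search_after_body(query: Dict[str, Any],
                      sort: List[Dict[str, str]],
                      size: int,
                      last_hit: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build the search body for the first page or for the page after `last_hit`."""
    body: Dict[str, Any] = {"size": size, "query": query, "sort": sort}
    if last_hit is not None:
        # Resume right after the last hit of the previous page.
        body["search_after"] = last_hit["sort"]
    return body


# Hypothetical sort: newest first, with a unique field as tiebreaker.
SORT = [{"timestamp": "desc"}, {"doc_id": "asc"}]

first_page = search_after_body({"match_all": {}}, SORT, size=50)
# After running first_page, feed the last hit of the response back in:
example_last_hit = {"_id": "doc-42", "sort": [1521500000000, "doc-42"]}
next_page = search_after_body({"match_all": {}}, SORT, size=50,
                              last_hit=example_last_hit)
```

There is no "from" in either body, which is why this only moves forward one page at a time.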

HTH

Thank you @yaauie for the detailed explanation.

The 10,000 limit is there for a reason.

Yeah, I get it. But I wonder why this limit is not applied to the "top hits aggregation". I am able to get more than 10,000 documents using the top hits aggregation.

Is it because aggregations handle this better than direct querying, or did Elasticsearch miss validating the window size in aggregations?

Please help me understand.

Thank you once again :slight_smile:

Thank you @dadonet for your precise explanation :slight_smile:

Please help me understand why the window size limit is not considered in the "top hits aggregation". I am able to fetch more than 10,000 docs using this aggregation.

Is it because aggregations are better at handling it, or did Elasticsearch miss validating the window size in aggregations?

Thanks in advance
Have a great day @dadonet :slight_smile:

This looks like a bug to me. @jimczi WDYT?

I agree we should honor this setting in TopHitsAggregation. I opened https://github.com/elastic/elasticsearch/issues/29190 for this purpose.

I answered too quickly, sorry. There's already a limit in the top_hits aggregation, called index.max_inner_result_window, that defaults to 100. This limit is per bucket, though, so it is still possible to return more than 10,000 documents if you return more than 100 buckets in a parent aggregation.
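
The arithmetic, as a quick sketch (the function name is made up; 100 is the default mentioned above):

```python
def max_top_hits_docs(buckets: int, per_bucket_window: int = 100) -> int:
    """Upper bound on documents returned when each bucket holds a top_hits."""
    # The limit applies to each bucket separately, so the totals multiply.
    return buckets * per_bucket_window


if __name__ == "__main__":
    for buckets in (50, 100, 101):
        print(buckets, max_top_hits_docs(buckets))
```

With 100 buckets the total just reaches 10,000, and one more bucket takes it past that.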

Oh, OK.
I am using Elasticsearch 5.5, and that's why I was able to fetch that many documents.
Thanks @jimczi @dadoonet for taking time to answer my question. :slight_smile:
Have a nice day.
