Search_after vs deep pagination

The documentation recommends search_after over deep pagination, but doesn't really explain why. Or at least I didn't understand the details.

Could someone explain why we should use search_after vs deep pagination?

So as I understand it, the issue with deep pagination using from + size is that Elasticsearch has to collect and sort every hit up to the page you ask for before it can return the slice you asked for. This is the result window. The memory needed per request is proportional to from + size, not just size, so it's not good for paging through large result sets: the more results you skip, the slower and more expensive each query becomes.
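To make that concrete, here's a minimal sketch of classic from + size paging with the Python client (elasticsearch-py). The index name, sort field, cluster address, and page numbers are all made up for illustration:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumes a local cluster

page, page_size = 500, 100
resp = es.search(
    index="my-index",                        # hypothetical index
    body={
        "query": {"match_all": {}},
        "sort": [{"timestamp": "desc"}],     # hypothetical sort field
        "from": (page - 1) * page_size,      # 49,900 hits must be collected and sorted first
        "size": page_size,                    # only these 100 are returned to you
    },
)
hits = resp["hits"]["hits"]
```

(With default settings this particular request would actually be rejected, since from + size exceeds the 10,000 result window.)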

Scroll and search_after let you do deep scrolling where you only fetch the next result set in an efficient manner, but you can't jump between arbitrary pages. That solves the scaling problem of from + size. Scroll needs to maintain a context per open scroll though, which isn't great when lots of users are doing lots of scrolling.
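For contrast, here's a rough sketch of the scroll API with the same hypothetical index (the 2-minute keep-alive is arbitrary). The first search opens a scroll context that the cluster has to keep alive between requests:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="my-index",
    scroll="2m",                      # keep the scroll context alive for 2 minutes
    body={"query": {"match_all": {}}, "size": 1000},
)
scroll_id = resp["_scroll_id"]

while resp["hits"]["hits"]:
    # process resp["hits"]["hits"] here ...
    resp = es.scroll(scroll_id=scroll_id, scroll="2m")
    scroll_id = resp["_scroll_id"]

es.clear_scroll(scroll_id=scroll_id)  # free the server-side context when done
```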

search_after is like scroll, but stateless. There isn't any additional state stored on the Elasticsearch side, so you don't have the scroll-context scaling issue and you don't have the from + size scaling issue. It does share the constraint that you can only page forwards.
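And a sketch of the same loop with search_after. The sort fields (a timestamp plus an id field as tiebreaker) are assumptions about the mapping, but the pattern is the important part: feed the sort values of the last hit from the previous page into the next request, with nothing kept on the cluster in between:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

body = {
    "query": {"match_all": {}},
    "sort": [{"timestamp": "desc"}, {"id": "asc"}],  # needs a deterministic sort with a tiebreaker
    "size": 100,
}

search_after = None
while True:
    if search_after:
        body["search_after"] = search_after
    resp = es.search(index="my-index", body=body)
    hits = resp["hits"]["hits"]
    if not hits:
        break
    # process hits here ...
    search_after = hits[-1]["sort"]  # sort values of the last hit drive the next page
```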

I usually work around the random paging issue through user experience. Most people don't actually want to jump to a random page of results. They search, scroll a bit, filter, scroll a bit more, rinse and repeat.

If scaling isn't a problem for you, though, then from + size is a really simple solution to paging.


Thank you for the response. This basically means search_after and scroll are exactly the same, except that search_after is stateless.

If the query paginates to page 500 with a page size of 100, along with a sort on a couple of fields and a few filters, would ES still sort and load all the matching documents into memory and only return the requested size?

Scaling definitely is a problem. Hitting a page beyond 5.4M leads to an OOM (I tried increasing the default 10k result window to 15M for testing purposes).

There's probably more subtlety to it than that; someone else who knows more might come along and shed some more light on it :slight_smile: As far as the end user is concerned, though, I think it's safe to say they operate in the same way.

That's right: it will run your filters to determine which documents match, collect them, sort them, and then only return the size documents after from. Which, as you pointed out, doesn't scale well. That's why the default limit is 10,000, so you get an error suggesting you look for alternative solutions rather than an OOM, which is bad news for your node.
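As a back-of-the-envelope illustration of why (the shard count is made up; the rest follows from your page-500 example):

```python
page, size = 500, 100
from_ = (page - 1) * size        # 49,900 hits to skip
per_shard = from_ + size         # each shard collects a sorted queue of ~50,000 entries
shards = 5                       # illustrative shard count
merged = shards * per_shard      # ~250,000 entries merged on the coordinating node
print(from_, per_shard, merged)  # 49900 50000 250000
```

With the default index.max_result_window of 10,000, that request is rejected up front instead of being allowed to eat memory.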

So, just to confirm: if ES runs the filters and sorts the documents for a request that asks for a document 5M deep, couldn't that still lead to an OOM? I am just looking for a good solution for deep pagination with filters and sorts included in the query.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.