Why is search_after preferred over Scroll API?

dandago · December 27, 2021, 3:07pm

I've read that "We no longer recommend using the scroll API for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT)." (Source: "Paginate search results", Elasticsearch Guide [7.16].)

However, I haven't been able to find any explanation of WHY search_after is preferred and the Scroll API is being cast aside. As far as I can tell, using search_after with a PIT would require Elasticsearch to keep data around for the duration of the time window, just like the Scroll API does. Also, I've run a basic test comparing the two approaches, and search_after appears to be slower and to scale worse.
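For anyone comparing the two, here is a minimal sketch of the search_after + PIT loop I mean. It is not the official client helper. The function name `iter_hits_search_after` and the `search` callable are my own assumptions: `search` stands for anything that takes a search request body and returns the response as a dict, so the sketch does not depend on a particular client version. It also assumes the PIT has already been opened (e.g. `POST /my-index/_pit?keep_alive=1m`) and that sorting by `_shard_doc` alone is acceptable.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical type: any function that sends a search body and returns the response dict.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_hits_search_after(
    search: SearchFn,
    pit_id: str,
    query: Optional[Dict[str, Any]] = None,
    page_size: int = 1000,
    keep_alive: str = "1m",
) -> Iterator[Dict[str, Any]]:
    """Yield every hit by paging with search_after inside one point in time."""
    body: Dict[str, Any] = {
        "size": page_size,
        "query": query or {"match_all": {}},
        # The PIT pins the index state; each request extends it by keep_alive.
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        # _shard_doc is the tiebreaker sort field available to PIT searches.
        "sort": [{"_shard_doc": "asc"}],
    }
    while True:
        response = search(body)
        hits = response["hits"]["hits"]
        if not hits:
            return  # an empty page means everything has been read
        yield from hits
        # The PIT id may change between requests, so carry the latest one forward.
        body["pit"]["id"] = response.get("pit_id", body["pit"]["id"])
        # Resume right after the sort values of the last hit on this page.
        body["search_after"] = hits[-1]["sort"]
```

With a 7.x Python client, `search` could be something like `lambda body: es.search(body=body)` (an assumption, adjust for your client version). The PIT should be deleted afterwards with `DELETE /_pit` rather than left to expire.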

Johnnycc1 · December 28, 2021, 11:13am

Afternoon Daniel, I'm also interested in why scroll is no longer recommended going forward, and why we are advised to use search_after with a PIT instead when we want to extract the data as it was at the time of the first query.

I'm also interested to hear whether the "scan" function in the Elasticsearch Python client, which still uses scroll, is intended to be replaced with search_after and a PIT.

system · January 25, 2022, 11:13am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
