Why is search_after preferred over the scroll API?

I've read the following in the docs: "We no longer recommend using the scroll API for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT)." (source: Paginate search results, Elasticsearch Guide [7.16])

However, I haven't been able to find any explanation of why search_after is preferred and the scroll API is being cast aside. As far as I can tell, using search_after with a PIT would still require Elasticsearch to keep the old index state around for the duration of the keep-alive window, just like the scroll API does. Also, I've run a basic test comparing the two approaches, and in that test search_after was actually slower and scaled worse.
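For anyone following along, the search_after + PIT loop under discussion can be sketched roughly as below. This is only an illustration of the pattern and not the test code mentioned above. `search_fn` is a hypothetical stand-in for whatever sends the request body to `_search` and returns the parsed JSON response; the function name, the default page size, and the `_shard_doc` sort (available with a PIT in 7.12+) are my assumptions.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional


def iter_hits_search_after(
    search_fn: Callable[[Dict[str, Any]], Dict[str, Any]],  # hypothetical: sends body to _search
    pit_id: str,
    page_size: int = 1000,
    keep_alive: str = "1m",
    sort: Optional[List[Dict[str, Any]]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield every hit by paging with search_after against a point in time."""
    # Assumption: sort on _shard_doc, the tiebreaker available when a PIT is used.
    sort = sort or [{"_shard_doc": "asc"}]
    search_after: Optional[List[Any]] = None

    while True:
        body: Dict[str, Any] = {
            "size": page_size,
            "sort": sort,
            # Each request extends the PIT's lifetime by keep_alive.
            "pit": {"id": pit_id, "keep_alive": keep_alive},
        }
        if search_after is not None:
            body["search_after"] = search_after

        response = search_fn(body)
        hits = response["hits"]["hits"]
        if not hits:
            return  # an empty page means we are done

        yield from hits

        # The PIT id can change between requests, so carry the latest one forward.
        pit_id = response.get("pit_id", pit_id)
        # Resume after the sort values of the last hit on this page.
        search_after = hits[-1]["sort"]
```

Opening the PIT before the loop and closing it afterwards is left to the caller. Unlike scroll, each page here is an ordinary stateless search request; the only server-side state is the PIT itself.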


Afternoon Daniel, I'm also interested in why scroll is no longer recommended going forward, and why we are instead advised to use search_after with a PIT when we want to extract the data as it was at the time of the first query.

I'd also like to hear whether the "scan" helper in the Elasticsearch Python client, which still uses scroll, is going to be replaced with an implementation based on search_after and a PIT.
