Elasticsearch Pagination: Scroll API

sundar.s · July 7, 2025, 8:16am

Hi,

I noticed that the following note has been added to the documentation for scroll pagination:

We no longer recommend using the scroll API for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT).

Could someone help me understand this better? We intend to use scroll pagination for one of our use cases, where the results have to be fetched in a paginated fashion and consumed by another application.

Why is this not recommended for paging through more than 10,000 hits? What is the impact?

dadoonet · July 7, 2025, 10:35am

That's exactly the goal of pit + search_after.
This is much better than the scroll API, as there are some optimizations behind the scenes.

Why is this not recommended for paging through more than 10,000 hits?

Actually, I think I read the sentence in another way than you did and we probably meant:

if you need to run data extraction for more than 10 000 hits, don't use from + size but search_after + pit.

Where previously it was:

if you need to run data extraction for more than 10 000 hits, don't use from + size but scroll

My 2 cents.
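
For readers following along, here is a minimal sketch of what a pit + search_after loop can look like. It is not taken from the thread: the function name `iter_hits` and the `search` callable are hypothetical, and the callable is assumed to send the given body to `_search` and return the parsed JSON response. The PIT is assumed to have been opened beforehand (for example with `POST /<index>/_pit?keep_alive=1m`), and it should be closed once the loop finishes.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical type: any function that posts a search body and returns the response as a dict.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_hits(
    search: SearchFn,
    pit_id: str,
    sort: List[Any],
    page_size: int = 1000,
    keep_alive: str = "1m",
) -> Iterator[Dict[str, Any]]:
    """Yield every hit by following search_after inside one point in time."""
    search_after: Optional[List[Any]] = None
    while True:
        body: Dict[str, Any] = {
            "size": page_size,
            "sort": sort,
            # Each request extends the PIT lifetime by keep_alive.
            "pit": {"id": pit_id, "keep_alive": keep_alive},
            "track_total_hits": False,
        }
        if search_after is not None:
            body["search_after"] = search_after

        response = search(body)
        hits = response["hits"]["hits"]
        if not hits:
            return  # an empty page means everything has been read

        yield from hits

        # The sort values of the last hit become the cursor for the next page.
        search_after = hits[-1]["sort"]
        # The response may carry an updated PIT id; keep the old one otherwise.
        pit_id = response.get("pit_id", pit_id)
```

The keep-alive only has to cover the time between two consecutive requests, not the whole export, so a short value such as `1m` is usually a reasonable starting point.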

sundar.s · July 8, 2025, 6:23am

Thanks @dadoonet. From your message, I understand that pit + search_after is more optimized than scroll.
But in our use case, we are leaning mostly towards scroll, because:

Identifying a unique sortable attribute may be a bit difficult. With the scroll API, we do not need any such sortable field.
The producing service (which fetches the data from ES in paginated batches) and the consuming service are not going to run all the time. These agents may run only when required.
We can probably keep the hits below 10,000 most of the time with proper search criteria.

Would you still advise that it's better to use pit + search_after instead of scroll? If we do go with scroll, what kind of impact can we expect?

dadoonet · July 8, 2025, 6:40am

Yeah. Sort by _doc. That's the most efficient way.
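
As a rough illustration of that suggestion, the request body below sorts by `_doc`, so no unique business field is needed. This is a sketch under assumptions: `build_page_request` is a hypothetical helper, the PIT id is a placeholder, and it assumes `_doc` is accepted as the sort key alongside a PIT as suggested above. The Elasticsearch documentation also describes an implicit `_shard_doc` tiebreaker that is added to searches using a PIT, which is what makes the cursor unambiguous.

```python
from typing import Any, Dict, List, Optional


def build_page_request(
    pit_id: str,
    search_after: Optional[List[Any]] = None,
    size: int = 1000,
    keep_alive: str = "1m",
) -> Dict[str, Any]:
    """Build one page request sorted by _doc (index order, no scoring)."""
    body: Dict[str, Any] = {
        "size": size,
        "sort": ["_doc"],  # no unique, user-defined sort field required
        "pit": {"id": pit_id, "keep_alive": keep_alive},
    }
    if search_after is not None:
        # Pass the "sort" array of the last hit from the previous page.
        body["search_after"] = search_after
    return body


first_page = build_page_request("my-pit-id")  # placeholder PIT id
next_page = build_page_request("my-pit-id", search_after=[42, 7])  # made-up cursor values
```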

That's even better. You can fetch all the hits in one single query. Which means that you don't need to hold the pit.
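
A minimal sketch of that single-query approach, assuming the default `index.max_result_window` of 10,000. The helper `fetch_all_at_once` and the `search` callable are hypothetical; the callable is assumed to send the body to `<index>/_search` and return the parsed JSON response.

```python
from typing import Any, Callable, Dict, List

MAX_RESULT_WINDOW = 10_000  # default index.max_result_window; assumed unchanged


def fetch_all_at_once(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    query: Dict[str, Any],
    max_hits: int = MAX_RESULT_WINDOW,
) -> List[Dict[str, Any]]:
    """Fetch every matching document in one request, or fail loudly if there are too many."""
    if max_hits > MAX_RESULT_WINDOW:
        raise ValueError("size cannot exceed the result window in a single request")

    body = {
        "size": max_hits,
        "query": query,
        "sort": ["_doc"],          # order does not matter for an export
        "track_total_hits": True,  # ask for an exact total so truncation is detectable
    }
    response = search(body)
    hits = response["hits"]["hits"]
    total = response["hits"]["total"]["value"]

    if total > len(hits):
        # More matches than one request returned: page with pit + search_after instead.
        raise RuntimeError(f"{total} hits matched but only {len(hits)} were returned")
    return hits
```

The explicit check on the total is there so that an export silently truncated at the window limit is caught rather than passed on to the consumer.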

sundar.s · July 21, 2025, 6:44am

Thanks David. This helps.

Could you please advise me on the following as well?

What is the maximum keep-alive duration for:
a. the scroll API
b. a PIT used with search_after

Thanks,
Sundar.
