Difference between various pagination APIs

Hi,

I was evaluating the performance of various pagination options. The ones I considered were:

  1. Scroll API
  2. Search After API
  3. from and size fields

I ran the test on a set of 1000 documents and analyzed the profiling output of each API for two queries, both sorted by timestamp:
query-a: Retrieval of the first 500 documents.
query-b: Retrieval of the last 500 documents.

Observations:
The total number of hits for both queries (query-a and query-b) was 1000 for all three APIs.

Questions:

  1. What is the difference in the implementation of these APIs if the number of hits is always the same?
  2. How does each API work? Do all APIs retrieve all documents for each query? If so, what exactly is the purpose of the context in the scroll API and the search_after API?
  3. Which API is better in terms of performance, and why?

PS: The documentation doesn't give any clue about how these APIs are implemented internally.

So what was the result of your analysis? I'd be curious.

Generally, a benchmark with only 1,000 results will IMO not show much of a difference. Try 1,000,000 or 10,000,000 documents and there will be one.
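
To see why the size of the data set matters, here is a rough back-of-the-envelope model in Python. It is only a sketch under assumptions: with from and size, every shard is assumed to collect from + size sorted entries before the coordinating node merges them, while with search_after every shard collects only size entries past the cursor. The function names and the shard count of 5 are made up for illustration, and the model counts entries; it does not measure time.

```python
def entries_collected_from_size(offset, size, shards):
    """Entries to merge for one from/size page (assumed model)."""
    return shards * (offset + size)


def entries_collected_search_after(size, shards):
    """Entries to merge for one search_after page; independent of depth (assumed model)."""
    return shards * size


SHARDS = 5       # hypothetical shard count
PAGE_SIZE = 500  # page size used in the original test

for total_docs in (1_000, 1_000_000, 10_000_000):
    # Offset of the last page, i.e. the "query-b" case from the question.
    last_page_offset = total_docs - PAGE_SIZE
    deep = entries_collected_from_size(last_page_offset, PAGE_SIZE, SHARDS)
    flat = entries_collected_search_after(PAGE_SIZE, SHARDS)
    print(f"{total_docs:>10} docs: from/size last page = {deep:>10}, "
          f"search_after page = {flat}")
```

In this model the two numbers differ by a factor of 2 for 1,000 documents and by a factor of 20,000 for 10,000,000. Assuming this is Elasticsearch, note that requests with from + size above 10,000 are rejected by default (index.max_result_window), so the deep from/size cases would need that setting raised. The total hit count is the number of matching documents, not the number returned, so it staying at 1000 for every API is expected.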

Also, no approach is simply better or worse; they are meant for different use cases:

  • Scroll to go through an entire dataset, mostly for non-human consumers that just need to batch through large numbers of documents.
  • Search after for continuous scrolling on a page.
  • from and size if you want to jump to a specific page, but avoid deep pagination since it is costly.
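
As a minimal sketch of how the three options differ on the wire, the Python helpers below only build the JSON request bodies; they do not talk to a cluster. The helper names and the "timestamp" field name are assumptions for illustration (the question only says the results are sorted by timestamp), and the bodies follow the Elasticsearch search API as I understand it.

```python
import json

# Assumed sort field. For search_after, a unique tiebreaker field is usually
# added to the sort so that equal timestamps do not skip or repeat documents.
SORT = [{"timestamp": "asc"}]


def from_size_body(page, page_size):
    """Jump straight to a page: the offset grows with the page number."""
    return {"from": page * page_size, "size": page_size, "sort": SORT}


def search_after_body(page_size, last_sort_values=None):
    """Continue after the sort values of the last hit of the previous page."""
    body = {"size": page_size, "sort": SORT}
    if last_sort_values is not None:
        body["search_after"] = last_sort_values
    return body


def scroll_next_body(scroll_id, keep_alive="1m"):
    """Body for POST /_search/scroll; the first request is a normal
    search sent with the ?scroll=<keep_alive> query parameter."""
    return {"scroll": keep_alive, "scroll_id": scroll_id}


if __name__ == "__main__":
    print(json.dumps(from_size_body(page=1, page_size=500)))
    # The sort value below is a made-up epoch-millis timestamp.
    print(json.dumps(search_after_body(500, last_sort_values=[1546300800000])))
    print(json.dumps(scroll_next_body("<scroll_id from the previous response>")))
```

The difference in state is visible here: from and size carries an offset, search_after carries the sort values of the last hit seen, and scroll carries an opaque id that refers to a search context kept alive on the server.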
