ES pagination under the hood

I am trying to implement pagination for some fairly heavy ES queries, so I did some research on the available options, but the documentation didn't answer some of my questions:

  1. search_after is the recommended way to do deep pagination because, unlike offset-style pagination, it doesn't require loading all the previous pages into memory and allows resuming from the cursor position (assuming no new documents appear). But how are this cursor and the sorted query results stored in ES internally? Should I be concerned about high inode consumption if I'm using search_after with queries that yield millions of hits (even if I only paginate through hundreds of them)? What about memory usage?

  2. Since search_after only allows paginating in one direction, I am thinking of a workaround for going back: reverse the sort order and use the first entry's sort values to go to the previous page. What are the performance implications of reversing the query sort order like this? How does it change ES's resource consumption?
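
To make the workaround in question 2 concrete, here is a minimal sketch of what I have in mind. It only builds the request body and never talks to a cluster. The helper names (reverse_sort, previous_page_body, restore_order) are hypothetical, and I'm assuming the sort is written as a list of single-key {"field": "asc" | "desc"} clauses.

```python
from typing import Any, Dict, List

# Assumed sort format: [{"timestamp": "desc"}, {"id": "asc"}]
SortSpec = List[Dict[str, str]]


def reverse_sort(sort: SortSpec) -> SortSpec:
    """Flip the direction of every sort clause."""
    flipped = []
    for clause in sort:
        # Each clause is assumed to hold exactly one field -> order pair.
        (field, order), = clause.items()
        flipped.append({field: "desc" if order == "asc" else "asc"})
    return flipped


def previous_page_body(
    query: Dict[str, Any],
    sort: SortSpec,
    first_hit_sort: List[Any],
    size: int,
) -> Dict[str, Any]:
    """Build a search body that walks backwards from the current page.

    first_hit_sort is the "sort" array of the FIRST hit on the current page.
    """
    return {
        "query": query,
        "sort": reverse_sort(sort),
        "search_after": list(first_hit_sort),
        "size": size,
    }


def restore_order(hits: List[Any]) -> List[Any]:
    """Hits come back in reversed order, so flip them before display."""
    return list(reversed(hits))
```

The hits of such a request arrive in the reversed order, so the client would have to flip them (restore_order) before showing the page.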

I have found the official ES documentation to be quite terse when it comes to "how things actually work", so I would appreciate any deep dives, explanations, or references here for a more solid understanding.

Thank you!

Welcome to our community!

I think part of this comes down to this section of the docs page you linked to, which I have bolded:

Repeat this process by updating the search_after array every time you retrieve a new page of results. If a refresh occurs between these requests, the order of your results may change, causing inconsistent results across pages. To prevent this, you can create a point in time (PIT) to preserve the current index state over your searches.

So it doesn't save the sort order for you; it redoes the sort on every request. You might want to consider using a point in time (PIT) request (see "Point in time API" in the Elasticsearch Guide [8.7])?
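
As a rough illustration of how the two fit together, here is a small sketch that only builds the search bodies and does not contact a cluster. The helper names (pit_page_body, next_cursor) are made up for this example, and the PIT id is assumed to have been obtained already from the point in time API. Since the PIT already identifies the indices, such a search is sent without an index name in the path.

```python
from typing import Any, Dict, List, Optional


def pit_page_body(
    pit_id: str,
    sort: List[Dict[str, Any]],
    size: int,
    search_after: Optional[List[Any]] = None,
    keep_alive: str = "1m",
) -> Dict[str, Any]:
    """Build the body for one page of a PIT + search_after search."""
    body: Dict[str, Any] = {
        "size": size,
        "sort": sort,
        # keep_alive extends the PIT's lifetime on each request.
        "pit": {"id": pit_id, "keep_alive": keep_alive},
    }
    if search_after is not None:
        # Omitted on the first page; set to the last hit's sort values after.
        body["search_after"] = list(search_after)
    return body


def next_cursor(response: Dict[str, Any]) -> Optional[List[Any]]:
    """Return the sort values of the last hit, or None when there are no hits."""
    hits = response["hits"]["hits"]
    return hits[-1]["sort"] if hits else None
```

Each response can carry an updated PIT id, so the next request should use the most recently returned one. Paging stops once next_cursor returns None.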

That said, I know it probably doesn't answer the underlying question of how things work under the hood, but that's not an area I am super knowledgeable about. I'll see if I can find someone who can comment, though.
