We have implemented pagination using search_after, sorting the results by _score with a unique id field as a tie-breaker. However, we sometimes get duplicate results across pages, and at other times matching documents do not appear on any page.
For example, with 65 total hits and a page size of 10, the last page has 6 results instead of 5, and one document appears on both page 5 and page 6.
Another example: a query matches 13,552 documents, all with exactly the same score, but the total number of results across all pages is only 5,900.
This happens consistently, even when the index hasn't been updated between requests. Curiously, it occurs when a replica shard has a different size on disk than its primary while holding the same number of documents. Recreating the replica fixes the problem and the result counts match up.
This post suggests that _score can be used in search_after.
We are using Elasticsearch 7.13.2 in a 3-node cluster with 5 shards and a replication factor of 1.
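For reference, here is roughly what our pagination looks like, sketched in Python as plain request bodies (the query and the doc_id tie-breaker field are placeholders, not our real mapping):

```python
# Sketch of the search_after pagination pattern described above.
# The match query and the "doc_id" field name are placeholders.

def first_page(size=10):
    """Request body for the first page: sort by _score, then a unique id."""
    return {
        "size": size,
        "query": {"match": {"title": "example"}},
        "sort": [
            {"_score": {"order": "desc"}},
            {"doc_id": {"order": "asc"}},  # unique tie-breaker field
        ],
    }

def next_page(last_hit_sort, size=10):
    """Request body for a subsequent page.

    last_hit_sort is the "sort" array of the last hit on the previous
    page, e.g. [7.3, "doc-42"].
    """
    body = first_page(size)
    body["search_after"] = last_hit_sort
    return body
```

Each page request repeats the same query and sort; only the search_after values change between pages.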
If your unique field is indeed unique, this should not happen. It suggests that the index was updated between search requests, or possibly that updates were not propagated to the replica.
The recommended way to use search_after in newer versions (including 7.13) is with a point in time (PIT). In that case you don't need to provide your own unique tie-breaker field; one is added automatically and is called _shard_doc.
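A rough sketch of the shape (query and keep_alive are just example values): you first open a PIT with POST /my-index/_pit?keep_alive=1m, which returns a PIT id, and then search without an index name in the URL, passing the PIT id in the body:

```python
# Sketch of a PIT-based search_after request body. The PIT id comes
# from a prior POST /<index>/_pit?keep_alive=1m call; the query is a
# placeholder.

def pit_page(pit_id, search_after=None, size=10):
    body = {
        "size": size,
        "query": {"match": {"title": "example"}},
        # No explicit tie-breaker needed here: PIT search requests add
        # an implicit _shard_doc tie-breaker to the sort.
        "sort": [{"_score": {"order": "desc"}}],
        # Each request extends the PIT's lifetime by keep_alive.
        "pit": {"id": pit_id, "keep_alive": "1m"},
    }
    if search_after is not None:
        # Sort values of the last hit on the previous page,
        # including the implicit _shard_doc value it returned.
        body["search_after"] = search_after
    return body
```

Because all pages read from the same point-in-time view of the index, refreshes and merges between page fetches can no longer cause duplicates or gaps.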
Thanks for your answer. We are integrating the search_after query into our UI pagination. If we use a PIT and keep the search context alive between UI page fetches, which could be minutes depending on user think time, would that scale well? We have to support at least a few hundred concurrent users.
Also, how would that affect other searches and the background ingestion process? The documentation says Lucene segment merging is impacted by open PIT contexts. We have a large, fairly rapidly changing index.
For scroll requests we have a limit of 500 open scroll contexts. Because PIT contexts are much more lightweight, we don't impose any limit on their number, so you can open as many PIT contexts as you like. We probably need to introduce some limit, though. In the worst case, if you constantly open PIT contexts with a very long keep_alive parameter while continuously updating your indices, you may run out of file descriptors or heap memory, because, as you rightly noticed, segments referenced by PIT contexts are kept and not deleted by merges.
On the other hand, if you use a relatively small keep_alive, say 10-15 minutes, use a high enough refresh_interval not to create many segments, and regularly monitor the number of open PIT contexts with GET /_nodes/stats/indices/search, then it will probably work fine.