How to ensure unique sorting with search_after in Elasticsearch 8?

jiel · February 28, 2025, 5:50pm

Hello,

I am working on a tool that needs to retrieve large batches of records from an Elasticsearch index. The recommended method used to be the Scroll API, but the documentation no longer recommends it for deep pagination and points to search_after instead.

I want to use sorting criteria that are agnostic to the document content. Sorting by @timestamp and _id seems appropriate, but sorting on _id is now disabled by default in Elasticsearch 8.

If I sort only by @timestamp, this value is not unique, which means I could miss some records.

So is there a way to efficiently retrieve large volumes of data (more than 10,000 records) while ensuring no records are skipped, using sorting criteria independent of document content?

strawgate · March 1, 2025, 2:14pm

Can you use the point in time (PIT) API with search_after? It adds a unique tiebreaker to the sort, using the _shard_doc value.

If not, and you don't have a unique value, then the flow that I use is:

1. Get a search_after batch of 10k documents.
2. Grab the timestamp from the last document (aka max_timestamp).
3. Iterate through the batch while timestamp < max_timestamp, doing whatever processing is required.
4. On the last iteration, grab that document's sort values for your next search_after call.

This effectively means you're excluding the documents with max_timestamp from processing in the current batch, and ensuring that all documents with max_timestamp will be present in your next search_after call.
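The flow above could be sketched in Python roughly as follows. This is only an illustration: split_batch is a hypothetical helper, and it assumes each hit carries a "sort" array whose first element is the timestamp, that the query sorts ascending on the timestamp alone, and that page_size matches the size you requested. Two edge cases are handled explicitly: a short page is treated as the final page and processed in full, and a page where every document shares one timestamp raises an error because the flow cannot advance past it.

```python
from typing import Any, Optional

Hit = dict[str, Any]


def split_batch(hits: list[Hit], page_size: int) -> tuple[list[Hit], Optional[list[Any]]]:
    """Return (hits that are safe to process now, sort values for the next search_after).

    Assumes hits are sorted ascending and hit["sort"][0] is the timestamp.
    """
    if len(hits) < page_size:
        # Short page: nothing follows it, so process everything and stop.
        return hits, None

    max_timestamp = hits[-1]["sort"][0]
    # Hold back every document that shares the last timestamp; some of its
    # siblings may not have fit into this page.
    safe = [h for h in hits if h["sort"][0] < max_timestamp]
    if not safe:
        raise ValueError("whole page shares one timestamp; increase page_size")

    # search_after is exclusive, so the next page starts right after the last
    # safe timestamp and therefore contains all max_timestamp documents.
    return safe, safe[-1]["sort"]
```

When the second return value is None, the loop is finished; otherwise pass it as search_after in the next request.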

jiel · March 5, 2025, 9:52am

Thanks William. I tried to sort on tie_breaker_id, as in the example from the documentation, but the query no longer returned any results (it is theoretically available; my instance is on version 8.17).

Anyway, I got a working solution by sorting on the _shard_doc field.
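For anyone landing here later, a minimal sketch of what that could look like in Python. It only builds the request bodies and drives the paging loop; build_page_request and iter_hits are hypothetical names, and the search argument stands in for whatever function sends the body to the _search endpoint with your client of choice. It assumes a PIT has already been opened (POST /<index>/_pit?keep_alive=1m), since _shard_doc is only available with a PIT, and that the index has an @timestamp field.

```python
from typing import Any, Callable, Iterator, Optional


def build_page_request(
    pit_id: str,
    search_after: Optional[list[Any]] = None,
    size: int = 10_000,
    keep_alive: str = "1m",
) -> dict[str, Any]:
    """Build a search body that pages by @timestamp with _shard_doc as tiebreaker."""
    body: dict[str, Any] = {
        "size": size,
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        "sort": [{"@timestamp": "asc"}, {"_shard_doc": "asc"}],
        "track_total_hits": False,  # skip hit counting to speed up paging
    }
    if search_after is not None:
        body["search_after"] = search_after
    return body


def iter_hits(
    search: Callable[[dict[str, Any]], dict[str, Any]],
    pit_id: str,
    size: int = 10_000,
) -> Iterator[dict[str, Any]]:
    """Yield every hit, page by page, until an empty page comes back."""
    search_after: Optional[list[Any]] = None
    while True:
        response = search(build_page_request(pit_id, search_after, size))
        hits = response["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Continue from the sort values of the last hit on this page.
        search_after = hits[-1]["sort"]
        # The PIT id may change between requests; use the most recent one.
        pit_id = response.get("pit_id", pit_id)
```

Remember to close the PIT once the loop is done.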

