Sliced scroll with sort

Burak · September 22, 2021, 1:36pm

Hello guys.
How does Elasticsearch actually perform sorting under the hood, per shard and per index? I can't find this in either the documentation or books.

I am going to use a sliced scroll with sort, but in some cases I have to sort a whole shard. I think it will work, but I cannot confirm this.
Is there a best practice for reading a large number of documents with sorting?

mayya · September 23, 2021, 8:39pm

For newer versions of Elasticsearch, we suggest using PIT (point in time) instead of _scroll. PIT also supports slicing.

Burak · September 29, 2021, 8:52am

Thanks for your answer!
Could you please clarify: do you recommend using PIT instead of scroll for reading a whole index, for example when I want to reprocess all of its documents, which may be 1 billion? Is PIT more efficient in terms of memory or performance?

mayya · September 29, 2021, 8:16pm

Yes, we recommend using PIT instead of scroll in all cases, as per these instructions. We no longer recommend using scroll.

As for the advantages of using PIT over scroll, there are several:

Slightly lower memory usage. Each scroll stores a search request, because a scroll is tied to a specific request. A PIT is a point-in-time view of the index, so it doesn't store search requests, and thus more search requests can run at the same time.
More resilient: if a node goes down during a series of PIT requests, an attempt will be made to continue on another node.
PIT slices should be faster, as they are based on internal Lucene _doc ids rather than the Elasticsearch doc _id field.
PIT is a new API that we plan to support for a very long time, while we may deprecate _scroll.
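
For readers who want to see roughly what a sliced PIT read with sort looks like, here is a minimal sketch in Python. It only builds the search request bodies and drives the search_after loop. The function names (build_slice_request, read_slice) and the search callable are hypothetical helpers for illustration, not part of any Elasticsearch client. It assumes the PIT has already been opened (POST /<index>/_pit?keep_alive=1m) and that search sends the body to the _search endpoint without an index name and returns the parsed JSON response.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional


def build_slice_request(
    pit_id: str,
    slice_id: int,
    max_slices: int,
    sort: List[Dict[str, Any]],
    size: int = 1000,
    search_after: Optional[List[Any]] = None,
    keep_alive: str = "1m",
) -> Dict[str, Any]:
    """Build the body of one PIT search request for a single slice."""
    body: Dict[str, Any] = {
        "size": size,
        "query": {"match_all": {}},
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        "sort": sort,
    }
    # Elasticsearch requires slice.max to be greater than 1,
    # so the slice clause is left out when there is only one slice.
    if max_slices > 1:
        body["slice"] = {"id": slice_id, "max": max_slices}
    # search_after carries the sort values of the last hit of the previous page.
    if search_after is not None:
        body["search_after"] = search_after
    return body


def read_slice(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],  # hypothetical transport
    pit_id: str,
    slice_id: int,
    max_slices: int,
    sort: List[Dict[str, Any]],
    size: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield every hit of one slice, page by page, in sort order."""
    search_after: Optional[List[Any]] = None
    while True:
        body = build_slice_request(
            pit_id, slice_id, max_slices, sort, size, search_after
        )
        response = search(body)
        hits = response["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Continue after the last hit, using the PIT id from the latest response.
        search_after = hits[-1]["sort"]
        pit_id = response.get("pit_id", pit_id)
```

Each slice could be read by its own worker, for example read_slice(search, pit_id, 0, 4, [{"_shard_doc": "asc"}]) for the first of four slices. Note that results are sorted within each slice only; a globally sorted stream would need the slices to be merged on the client side. The PIT should be closed once all slices are done.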

Burak · September 30, 2021, 7:05am

Thanks for the detailed response!

system · October 28, 2021, 7:05am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
