For exporting data, should we use scroll or PIT with search_after? The documentation says not to use scroll for deep pagination.
Assume that the data we are querying will not change.
I guess it will depend on which version you are using, but for Elasticsearch 8.4 the recommendation is:
We no longer recommend using the scroll API for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT).
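For illustration, here is a rough sketch of what an export loop with a PIT and search_after could look like. It is not taken from the documentation: `export_hits` and the `send(method, path, body)` callable are hypothetical names standing in for whatever HTTP client you use, and the page size and keep_alive values are arbitrary. The request shapes (opening a PIT on the index, sorting on `_shard_doc`, passing the last hit's sort values as `search_after`, and closing the PIT at the end) follow the REST API as I understand it for 8.x.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical transport: send(method, path, body) returns the parsed JSON response.
Send = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def export_hits(
    send: Send,
    index: str,
    page_size: int = 1000,
    keep_alive: str = "1m",
) -> Iterator[Dict[str, Any]]:
    """Yield every hit of `index`, paging with a PIT and search_after."""
    # Open a point in time so every page sees the same index state.
    pit_id = send("POST", f"/{index}/_pit?keep_alive={keep_alive}", None)["id"]
    search_after = None
    try:
        while True:
            body: Dict[str, Any] = {
                "size": page_size,
                # The index is implied by the PIT, so the search path has no index.
                "pit": {"id": pit_id, "keep_alive": keep_alive},
                # _shard_doc is a tiebreaker sort that is available with a PIT.
                "sort": [{"_shard_doc": "asc"}],
            }
            if search_after is not None:
                body["search_after"] = search_after
            resp = send("POST", "/_search", body)
            # The PIT id can change between requests; always use the latest one.
            pit_id = resp.get("pit_id", pit_id)
            hits = resp["hits"]["hits"]
            if not hits:
                break
            yield from hits
            # The sort values of the last hit are the cursor for the next page.
            search_after = hits[-1]["sort"]
    finally:
        # Close the PIT so the cluster can release its resources.
        send("DELETE", "/_pit", {"id": pit_id})
```

Since the function is a generator, the caller can stream hits straight to a file without holding the whole export in memory.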
@dadoonet Sorry for tagging you, but I just wanted to confirm this. I found a reply of yours saying that _scroll is preferred over search_after for pulling a whole dataset. Does that still apply?
The documentation does not state this clearly.
It still makes sense to me to use _scroll, since with search_after I still need to go through the hits to find the sort value of the last document, whereas with _scroll I just use the scroll_id for the next batch.
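On the point about finding the sort value: as far as I can tell, there is no need to walk through the hits, because the cursor is just the `sort` array of the last hit on the page. A minimal sketch, assuming the response has already been parsed into a dict (`next_search_after` is a hypothetical helper name):

```python
from typing import Any, Dict, List, Optional


def next_search_after(response: Dict[str, Any]) -> Optional[List[Any]]:
    """Return the search_after value for the next page, or None when done."""
    hits = response["hits"]["hits"]
    # Only the last hit matters; an empty page means the export is finished.
    return hits[-1]["sort"] if hits else None
```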
What @Christian_Dahlqvist said is 100% correct.
I should edit my old answer I guess.
Thank you @dadoonet
I also had another query about _scroll