Deep pagination vs Scroll vs Search After

Hi all,

My use-case:

Export up to 1 million documents of ~5 KB each to an Excel export, about 5 GB of output in total. This is not for real-time users and will be used by at most one user at a time. We are running Elasticsearch 2.4. The options I see:

a) Use deep pagination up to the 20 K limit and let the user keep advancing the range until all data is exported. My estimate: 20 K hits * 5 shards = 100 K documents * 5 KB each = 500 MB of memory in use each time a page is fetched. After the next batch of 20 K is requested, the memory used by the previous batch becomes eligible for garbage collection, so at any point in time roughly 500 MB of JVM heap is in use.
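For concreteness, a minimal sketch of such a from/size request with the Python client (the index name and local cluster URL are assumptions; note that Elasticsearch caps from + size via index.max_result_window, 10,000 by default, so 20 K pages need that setting raised):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

PAGE_SIZE = 20_000  # the per-page limit discussed above

def fetch_page(page: int):
    """Fetch one page with plain from/size pagination.

    Caveat: from + size is limited by index.max_result_window
    (10,000 by default), so 20 K pages require raising that setting.
    """
    return es.search(
        index="export_index",  # hypothetical index name
        body={
            "from": (page - 1) * PAGE_SIZE,
            "size": PAGE_SIZE,
            "query": {"match_all": {}},
        },
    )
```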

b) Use the scroll API instead of deep pagination and keep scrolling: how much memory will be used with each slice? What is the cost of this operation in CPU and JVM heap? How much JVM heap is in use at any point in time? Sorting/aggregation is not needed; that can be done in Excel.
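A minimal scroll-export sketch, again assuming the Python client and a hypothetical index. The server keeps a search context open per shard for the scroll's lifetime, which is where its memory cost comes from; also note that sliced scrolls were only added in Elasticsearch 5.0, so on 2.4 a scroll runs as a single sequential stream:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

def export_with_scroll(index: str, batch_size: int = 1_000):
    """Stream every document of an index via the scroll API.

    The cluster keeps a snapshot (search context) open per shard for the
    scroll's lifetime; the memory cost comes from those open contexts,
    not from how deep the scroll has progressed.
    """
    resp = es.search(
        index=index,
        scroll="2m",  # keep the context alive for 2 minutes per round trip
        body={"size": batch_size, "query": {"match_all": {}}},
    )
    scroll_id = resp["_scroll_id"]
    try:
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit["_source"]
            resp = es.scroll(scroll_id=scroll_id, scroll="2m")
            scroll_id = resp["_scroll_id"]
    finally:
        es.clear_scroll(scroll_id=scroll_id)  # release the server-side context
```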

c) Use Search After instead of deep pagination: how much memory will be used with each batch? What is the cost of this operation in CPU and JVM heap? How much JVM heap is in use at any point in time?
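A search_after sketch for comparison; be aware that search_after was introduced in Elasticsearch 5.0, so it is not available on 2.4 without an upgrade. The sort fields below (timestamp, id) are hypothetical; any deterministic sort ending in a unique tiebreaker field works:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

def export_with_search_after(index: str, batch_size: int = 1_000):
    """Page through all documents with search_after.

    Unlike scroll, nothing is held open on the server between requests,
    so each request costs roughly the same as a first-page query.
    Requires a deterministic sort with a unique tiebreaker to avoid
    skipping or duplicating documents on ties.
    """
    search_after = None
    while True:
        body = {
            "size": batch_size,
            "query": {"match_all": {}},
            "sort": [{"timestamp": "asc"}, {"id": "asc"}],  # hypothetical fields
        }
        if search_after is not None:
            body["search_after"] = search_after
        resp = es.search(index=index, body=body)
        hits = resp["hits"]["hits"]
        if not hits:
            break
        for hit in hits:
            yield hit["_source"]
        search_after = hits[-1]["sort"]  # resume from the last hit's sort values
```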

Which option(s) fit best, and what are their pros and cons?

Thanks in advance...

Your estimate of the memory usage of deep pagination is not correct. When you ask Elasticsearch for the 100th page, it still needs to retrieve all results from the previous 99 pages, because a shard cannot predict how its documents rank against documents coming from other shards. Each shard therefore returns its top from + size candidates, and the coordinating node merges and sorts all of them before discarding the first from entries.
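A small sketch of that arithmetic (shard and page numbers are illustrative):

```python
def coordinating_node_entries(shards: int, page: int, page_size: int) -> int:
    """Sort entries the coordinating node collects for one from/size page.

    Each shard must return its own top (from + size) candidates because it
    cannot know how its documents rank against those of other shards.
    """
    from_offset = (page - 1) * page_size
    return shards * (from_offset + page_size)

# Page 5 of 20 K-document pages on a 5-shard index:
# 5 * (80_000 + 20_000) = 500_000 entries merged for a single page,
# and the cost keeps growing linearly with the page number.
print(coordinating_node_entries(shards=5, page=5, page_size=20_000))
```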
