I was wondering, from a performance perspective (more specifically,
cranking through the data as quickly as possible), which one is better if I
wanted to scroll through a large-ish (hundreds of gigabytes to a few
terabytes) index with an ordered field (e.g. all docs have a date field):
- Do many small scrolls, one for each non-overlapping interval
(again, for example, if I know beforehand that there's only 1 year of data,
then 12 scrolls, one for each month)
- Go ahead and just do a normal scroll with match_all or something similar.
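
To make the two options concrete, here is a rough sketch in Python of the request bodies I have in mind. The field name "date", the year, and the helper names are placeholders for illustration only, not anything from a real mapping. Each body would be sent as its own scroll search.

```python
from datetime import date


def monthly_range_queries(year, field="date"):
    """Option 1: one query body per month, covering [month start, next month start).

    The half-open intervals (gte / lt) keep the 12 scrolls non-overlapping.
    `field` is a hypothetical name for the ordered date field.
    """
    bodies = []
    for month in range(1, 13):
        start = date(year, month, 1)
        # December rolls over to January of the following year.
        end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
        bodies.append(
            {"query": {"range": {field: {"gte": start.isoformat(),
                                         "lt": end.isoformat()}}}}
        )
    return bodies


def match_all_query():
    """Option 2: a single query body that scrolls over every document."""
    return {"query": {"match_all": {}}}


if __name__ == "__main__":
    for body in monthly_range_queries(2013):
        print(body)
    print(match_all_query())
```

With the first option the 12 scrolls could in principle run one after another or concurrently. With the second there is a single scroll cursor over the whole index.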
The reason I ask is because it was mentioned in previous posts that the
deeper you go into a scroll, the slower it gets. Would this technique
alleviate that? What are the tradeoffs? Also, would the answer change if it
was a single node cluster versus a multi-node cluster?
Thanks in advance!