I have a case where I need to fetch a (very) large number of documents.
I have a list of 15,000 entity IDs that I need to export data for.
My docs have an entity_id field.
What I do so far is partition this input list of IDs and then, for every partition, use a terms query plus a time range query to fetch the data using scroll. My partition size is 100.
Is there a better/faster way of doing that?
Any ideas?
What you describe sounds like a reasonable way to do this.
Does this mean you're only retrieving 100 documents in each batch? I would expect a larger batch size to be faster if so. You'll need to experiment to find the best value for your system.
Also, if you're only retrieving 100 documents each time, then you don't need to scroll.
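To make the batch-size point concrete, here is a minimal sketch of a scroll loop where the number of documents per round trip is a parameter you can tune. It assumes a client object shaped like the official Python client, with `search`, `scroll` and `clear_scroll` methods, and it passes the query as `body=`, which some client versions accept and others deprecate. The function name `scroll_hits`, the default of 1000 and the `2m` keep-alive are illustrative choices, not recommendations.

```python
from typing import Any, Dict, Iterator


def scroll_hits(client: Any, index: str, body: Dict[str, Any],
                page_size: int = 1000,
                keep_alive: str = "2m") -> Iterator[Dict[str, Any]]:
    """Yield every hit for `body`, fetching `page_size` documents per round trip.

    `client` is assumed to expose search/scroll/clear_scroll like the
    Elasticsearch Python client; nothing here is specific to one version.
    """
    # The first request opens the scroll context and returns the first page.
    resp = client.search(index=index, body=body, scroll=keep_alive, size=page_size)
    scroll_id = resp.get("_scroll_id")
    try:
        # An empty page means the scroll is exhausted.
        while resp["hits"]["hits"]:
            yield from resp["hits"]["hits"]
            resp = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            # The scroll id may change between pages, so keep the latest one.
            scroll_id = resp.get("_scroll_id", scroll_id)
    finally:
        # Release the scroll context on the server, even if the caller stops early.
        if scroll_id:
            client.clear_scroll(scroll_id=scroll_id)
```

Timing the same export with a few different `page_size` values is probably the simplest way to find what works best on a given cluster.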
It means that in one batch I am fetching all the data for 100 entities (out of 15,000), which can be hundreds of thousands of documents.
So I do need scroll.
In the SQL world, this would be, for each partition:
select * from data where entity_id in (1,2,3,...100) and timestamp between xxx and yyy
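For readers following along, the SQL above corresponds roughly to a terms filter combined with a range filter. The sketch below builds that request body in Python and splits the ID list into partitions. The field names `entity_id` and `timestamp` come from this thread; the helper names `partition` and `build_query` and the example dates are made up for illustration.

```python
from typing import Any, Dict, Iterator, List, Sequence


def partition(ids: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive chunks of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def build_query(entity_ids: List[int], start: str, end: str) -> Dict[str, Any]:
    """entity_id IN (...) AND timestamp BETWEEN start AND end, as a search body.

    Both clauses sit in a bool filter, so no relevance scores are computed.
    BETWEEN is inclusive in SQL, hence gte/lte rather than gt/lt.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"entity_id": entity_ids}},
                    {"range": {"timestamp": {"gte": start, "lte": end}}},
                ]
            }
        }
    }


# Example: 15000 ids in partitions of 100 gives 150 request bodies.
all_ids = list(range(1, 15001))
bodies = [build_query(chunk, "2020-01-01", "2020-01-31")
          for chunk in partition(all_ids, 100)]
```

Each body would then be run through a scroll, one partition at a time.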
Ok, got it. In that case, yes, a scroll seems like a good idea.