I want to export my documents using either the Scroll API or search_after requests, but I can't decide which one to use. First of all, I have 2 different indices.
The first one is constant: it holds 2 million documents and they never change (no updates or deletes).
The other index also has millions of documents, around 4 to 6 million, and it can keep growing. Its documents are updated continuously.
My question is this. I want to export my documents at a fixed rate, let's say 20k documents per minute. I can do that with the Scroll API, that part is fine. However, if my system goes down I must not lose my scrollId, so that I can continue the export where it stopped. Also, a user might want to export only 100 documents per minute, and with 10 million records it would take a very long time to export everything. Is that a bad operation and a heavy load for Elasticsearch? As far as I understand, a scroll works on a snapshot of the index, and since my documents are updated continuously, the old versions of the documents would have to be kept for a long time. (Of course I would clear the scroll after my job is finished.)
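To put rough numbers on the export duration, here is a small sketch. It is plain arithmetic on the figures above, nothing measured, and pairing the 2 million index with the 20k per minute rate is only my assumption for illustration.

```python
def export_minutes(total_docs: int, docs_per_minute: int) -> float:
    """Minutes needed to export total_docs at a fixed rate."""
    return total_docs / docs_per_minute


fast = export_minutes(2_000_000, 20_000)  # fixed index at 20k docs per minute
slow = export_minutes(10_000_000, 100)    # 10 million docs at 100 docs per minute

# Prints: 100 min, 100000 min (~69.4 days)
print(f"{fast:.0f} min, {slow:.0f} min (~{slow / 60 / 24:.1f} days)")
```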
How long should I keep the search context alive? Can I use scroll=6h, for example, or is it bad practice to keep a search context open for 6 hours? I think the maximum keep-alive (search.max_keep_alive) defaults to 24h, so if 24h is allowed, should 6h be a problem?
Or should I use search_after, export my documents in batches, and, if my system goes down, continue where it stopped? I would know where it stopped from the last sort values (for example a sort field plus tie_breaker_id).
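What I have in mind for the search_after option is roughly the following. It is only a minimal sketch under assumptions: `created_at` is a hypothetical sort field, `tie_breaker_id` is assumed to be unique per document, and `export_resumable`, `search`, `write_batch` and the checkpoint file are names I made up for illustration. The `search` argument is any function that takes a request body and returns a response shaped like an Elasticsearch search response.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional

# Hypothetical sort: "created_at" is an assumed field name; "tie_breaker_id"
# is assumed to hold a unique value per document so the order is total.
SORT = [{"created_at": "asc"}, {"tie_breaker_id": "asc"}]

Search = Callable[[Dict[str, Any]], Dict[str, Any]]


def export_resumable(
    search: Search,
    write_batch: Callable[[List[Dict[str, Any]]], None],
    checkpoint: Path,
    batch_size: int = 1000,
    max_batches: Optional[int] = None,
) -> int:
    """Export hits in sort order, saving the last sort values after each batch.

    Returns the number of documents exported in this run.
    """
    # Resume from the saved sort values if a previous run left a checkpoint.
    search_after = json.loads(checkpoint.read_text()) if checkpoint.exists() else None
    exported = 0
    batches = 0
    while max_batches is None or batches < max_batches:
        body: Dict[str, Any] = {"size": batch_size, "sort": SORT}
        if search_after is not None:
            body["search_after"] = search_after
        hits = search(body)["hits"]["hits"]
        if not hits:
            break  # nothing left after the checkpoint
        write_batch(hits)
        # Persist only after the batch was written, so a crash re-exports
        # at most one batch instead of skipping documents.
        search_after = hits[-1]["sort"]
        checkpoint.write_text(json.dumps(search_after))
        exported += len(hits)
        batches += 1
    return exported
```

With the official Python client, `search` could probably be something like `lambda body: es.search(index="my-index", **body)`, where the index name is made up and I have not checked this against a specific client version. Calling the function once per minute with `max_batches` set would give the rate limit, and after a restart it would pick up from the checkpoint file.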
Any ideas would help. Thanks!