How to get a list of documents created by the reindex API in the destination index

I have a question about the reindex API. Is there a way to get a list of all the documents it has created in the destination index?
Context:
I am using the reindex API to migrate a few indices from a remote ES cluster. The API works fine. But if it crashes for some reason, say because of an issue with ES, there is no way to restart it from where it left off. It starts again from the first document. The API ignores conflicting documents in my case, so restarting does not cause any correctness problem, but my index holds millions of documents and starting over after a failure is very time-consuming.
If I had a list of the documents the API created before crashing, I could resume the migration from that point. Any help with this, or any alternative solution, is appreciated. Thank you!

If these documents were already created by reindex before the crash, you can set the op_type parameter to create in the dest section of the request. Reindex will then only create the documents that are missing from the destination. The reindex API documentation covers this in more detail.
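
As an illustration of the suggestion above, here is a minimal sketch of what such a request body could look like, built as a Python dict. The host and index names are placeholders, not values from this thread. The `conflicts: proceed` setting is an addition assumed here so that documents already present in the destination are counted as version conflicts instead of aborting the run; the resulting body would be sent as a `POST` to the `_reindex` endpoint of the destination cluster.

```python
import json


def build_reindex_body(remote_host, source_index, dest_index):
    """Build a _reindex request body that only creates missing documents."""
    return {
        # Assumption: keep going when a document already exists in the
        # destination instead of aborting on the first version conflict.
        "conflicts": "proceed",
        "source": {
            # Reindex-from-remote: the source lives on another cluster.
            "remote": {"host": remote_host},
            "index": source_index,
        },
        "dest": {
            "index": dest_index,
            # Only create documents that do not exist yet in the destination.
            "op_type": "create",
        },
    }


# Placeholder host and index names, for illustration only.
body = build_reindex_body(
    "http://old-cluster:9200", "my-source-index", "my-dest-index"
)
print(json.dumps(body, indent=2))
```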


Agreed, I won't have a problem of duplicate documents.
My problem is that the reindex API will still scan through all the source documents, and for each one either skip it because it is already present or index it as new. Because I have millions of documents, scanning through the ones that were already copied can take a lot of time.
So I was wondering whether this could be optimised.

I agree a crash is very rare, but I may also have to stop the API if I see increased load on ES and then resume it later. So the ability to resume the reindex API from where I left off would really help!
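
One possible workaround, sketched here under assumptions rather than as a built-in resume feature: split the migration into several smaller reindex requests, each restricted by a `range` query in the `source` section, and record which windows have finished. After a crash or a manual stop, only the unfinished windows need to be submitted again. The field name `created_at`, the window boundaries, and the helper names are hypothetical; this only works if the documents carry a field that partitions them cleanly.

```python
def build_window_bodies(remote_host, source_index, dest_index, field, boundaries):
    """Return one _reindex body per [start, end) window over `field`."""
    bodies = []
    # Pair consecutive boundaries: (b0, b1), (b1, b2), ...
    for start, end in zip(boundaries, boundaries[1:]):
        bodies.append({
            "conflicts": "proceed",
            "source": {
                "remote": {"host": remote_host},
                "index": source_index,
                # Restrict this request to one window of the source data.
                "query": {"range": {field: {"gte": start, "lt": end}}},
            },
            "dest": {"index": dest_index, "op_type": "create"},
        })
    return bodies


def remaining(bodies, completed_count):
    """Skip the windows that already finished in an earlier run."""
    return bodies[completed_count:]


# Hypothetical field and boundaries, for illustration only.
windows = build_window_bodies(
    "http://old-cluster:9200",
    "my-source-index",
    "my-dest-index",
    "created_at",
    ["2023-01-01", "2023-02-01", "2023-03-01", "2023-04-01"],
)
# Suppose the first window completed before the interruption.
todo = remaining(windows, 1)
print(len(windows), len(todo))
```

Because each window still uses `op_type: create`, resubmitting a window that was only partly copied remains safe; only that one window is rescanned instead of the whole index.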
