Reindex automation - detection of failures / missing documents

Hi, I'm trying to automate the reindex operation. I'm looking for a robust way to determine that my reindex operation was indeed successful before I start using the new indexes. I always start the reindex as an asynchronous operation and query for the task status. Currently I need to support both Elasticsearch v6.8 and v7.3.

In the v6.8 Task API documentation I've seen [LINK]:

This object contains the actual status. It is identical to the response JSON except for the important addition of the total field. total is the total number of operations that the _reindex expects to perform. You can estimate the progress by adding the updated, created, and deleted fields. The request will finish when their sum is equal to the total field.

However, both versions describe total in the reindex response as [LINK]:

(integer) The number of documents that were successfully processed.

When playing around with a v7.3 cluster, I see that it behaves more like what the v6.8 docs describe. When I force my reindex operation to fail on some documents (even synchronously), total still counts them. I see that some documents were created/updated, but total looks like the sum of all operations, including the failed ones.

In v7.3 the task status for a reindex operation also returns failures [LINK]:

Array of failures if there were any unrecoverable errors during the process. If this is non-empty then the request aborted because of those failures. Reindex is implemented using batches and any failure causes the entire process to abort but all failures in the current batch are collected into the array. You can use the conflicts option to prevent reindex from aborting on version conflicts.

So in my synthetic situation I've got a task with total equal to 13, where 6 documents were created and there are 7 failures for the other documents, which were not valid according to the mapping definition.

  1. Is there a risk of OOM in v7.3 when lots of batches fail for many documents (e.g. when re-indexing a very large index), causing this failures array to grow to an enormous size?

  2. Will there always be failures entries when some documents could not be indexed? I could not find this in the docs.

  3. Is it safe to assume that the reindex was not fully successful when `total` - (`created` + `updated` + `noop?` + `deleted`) > 0, or is there a better way to assert that in v6.8 and v7.3?
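
To make question 3 concrete, here is a minimal sketch of the check I have in mind, written against a plain dict that mirrors the task status. The function names are hypothetical, and I'm assuming the no-op counter is called `noops` and that a missing counter can be treated as 0. Whether this check is actually sufficient is exactly what I'm asking.

```python
from typing import Any, Dict, List

# Counters that I assume together account for every processed document.
# "noops" is my guess at the field name for the `noop?` term above.
OUTCOME_COUNTERS = ("created", "updated", "deleted", "noops")


def unaccounted_docs(status: Dict[str, Any]) -> int:
    """Return total minus the sum of the outcome counters."""
    counted = sum(int(status.get(key, 0)) for key in OUTCOME_COUNTERS)
    return int(status.get("total", 0)) - counted


def reindex_looks_successful(status: Dict[str, Any]) -> bool:
    """True only if there are no failures and every document is accounted for."""
    failures: List[Any] = status.get("failures") or []
    return not failures and unaccounted_docs(status) == 0


# Numbers from my synthetic test: total 13, 6 created, 7 failures.
# The failure entries are placeholders, not real Elasticsearch output.
synthetic_status = {
    "total": 13,
    "created": 6,
    "updated": 0,
    "deleted": 0,
    "noops": 0,
    "failures": [{"placeholder": i} for i in range(7)],
}

print(unaccounted_docs(synthetic_status))
print(reindex_looks_successful(synthetic_status))
```

I check both conditions because I don't know yet whether failures is always populated (question 2). The sketch also ignores version conflicts, which may need separate handling when conflicts are allowed to proceed.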
