Hi, I'm trying to automate the reindex operation. I'm looking for a robust way to determine that my reindex operation was indeed successful before I start using the new indexes. I always start the reindex as an asynchronous operation and query for the task status. Currently I need to support both Elasticsearch `v6.8` and `v7.3`.
In the documentation of the `v6.8` Task API I've seen [LINK]:
This object contains the actual status. It is identical to the response JSON except for the important addition of the `total` field. `total` is the total number of operations that the `_reindex` expects to perform. You can estimate the progress by adding the `updated`, `created`, and `deleted` fields. The request will finish when their sum is equal to the `total` field.
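As I read that paragraph, progress could be estimated roughly like this. This is only a sketch: the `status` dict stands for the status object of the task, and the helper name `estimate_progress` is hypothetical, not something from Elasticsearch.

```python
from typing import Any, Dict


def estimate_progress(status: Dict[str, Any]) -> float:
    """Return the fraction of operations done, per the quoted v6.8 docs.

    Assumes `status` holds the counters named in the quote:
    `total`, `updated`, `created` and `deleted`.
    """
    total = status.get("total", 0)
    if total <= 0:
        # Nothing reported yet, so no progress can be estimated.
        return 0.0
    done = (
        status.get("updated", 0)
        + status.get("created", 0)
        + status.get("deleted", 0)
    )
    # Clamp in case the counters ever overshoot `total`.
    return min(done / total, 1.0)


# Example with made-up counters: 100 of 200 operations done.
progress = estimate_progress(
    {"total": 200, "created": 50, "updated": 30, "deleted": 20}
)
```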
However, both versions describe `total` in the reindex response as [LINK]:
`total` (integer) The number of documents that were successfully processed.
When playing around with a cluster on `v7.3`, I see that it behaves more like what the docs for `v6.8` describe. When I force my reindex operation to fail (even synchronously) on some documents, `total` still counts them. I see that some were created/updated, but `total` seems to be the sum of all operations.
In `v7.3`, the task status for a reindex operation also returns `failures` [LINK]:
`failures`
Array of failures if there were any unrecoverable errors during the process. If this is non-empty then the request aborted because of those failures. Reindex is implemented using batches and any failure causes the entire process to abort but all failures in the current batch are collected into the array. You can use the conflicts option to prevent reindex from aborting on version conflicts.
So in my synthetic situation I've got a task that contains `total` equal to `13`, where `6` documents were created and there are `7` failures for the other documents, which were not valid due to the mapping definition.
- Is there a risk of `OOM` for `v7.3` when lots of batches fail for many documents (like re-indexing a very large index), causing this `failures` array to grow to an enormous size?
- I could not find in the docs whether there will always be `failures` entries when some documents could not be indexed.
- Is it safe to assume that the reindex was not fully successful when:
  `total` - (`created` + `updated` + `noop?` + `deleted`) > 0
  or is there a better way to assert that in `v6.8` and `v7.3`?
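
To make the last question concrete, here is a minimal sketch of the check I have in mind. It assumes I have already pulled the counters and the `failures` array out of the task result into plain Python objects. The helper names are hypothetical, and I'm assuming the no-op counter is called `noops` in the response. Whether this check is actually sufficient is exactly what I'm asking.

```python
from typing import Any, Dict, List, Optional

# Counters that I assume account for every processed document.
# `noops` is my assumption for the field written as `noop?` above.
COUNTED_FIELDS = ("created", "updated", "noops", "deleted")


def unaccounted_docs(status: Dict[str, Any]) -> int:
    """Return `total` minus the sum of the per-outcome counters."""
    done = sum(status.get(field, 0) for field in COUNTED_FIELDS)
    return status.get("total", 0) - done


def reindex_looks_successful(
    status: Dict[str, Any],
    failures: Optional[List[Any]] = None,
) -> bool:
    """Treat the reindex as successful only if nothing is left over.

    Both conditions must hold: the `failures` array is empty (or absent)
    and every document counted in `total` shows up in one of the counters.
    """
    return not failures and unaccounted_docs(status) == 0


# My synthetic case: total 13, 6 created, 7 mapping failures.
status = {"total": 13, "created": 6, "updated": 0, "noops": 0, "deleted": 0}
failures = [{"cause": "mapping error (placeholder)"}] * 7

leftover = unaccounted_docs(status)
ok = reindex_looks_successful(status, failures)
```

With the numbers from my test, `leftover` comes out as 7 and `ok` is `False`, which matches what I see. The open question is whether the same holds on both versions when `failures` is empty.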