Deletions in reindex

I have an index that I want to reindex into a new index with more primary shards, so I started reindexing. I didn't make the source index read-only, so documents were still being indexed, updated, and deleted in the source index while the reindex ran. I am now trying to sync those changes from the old index to the new one by running the reindex again with version_type set to external. All the new and updated documents were copied over, but the documents deleted from the old index were not deleted from the new index. How can I handle this?

Well, reindex only reads documents that exist in the source index. It never sees deleted documents (which theoretically no longer exist), so it has nothing to propagate to the destination.
I can't think of any magical way to handle that.

If it's a one-time operation, I'd probably try to compare all the ids in the new index with the old one.
That's a heavy task, so you may be able to find a smarter way to do it.
For example, if you have a timestamp in your documents, count the number of docs per day (date_histogram agg) in both indices, and for any day where the counts differ, check each id from the destination against the source.
The check should be as easy as:

HEAD index/_doc/id

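Run it against the old index. You get a 404 if the document is not found there, which means it was deleted and can be deleted from the new index too.
You get a 2xx otherwise.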
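
If it helps, here is a rough Python sketch of that idea, not a tested tool. It assumes you have already fetched the date_histogram buckets for both indices and the list of destination ids for a given day. The function names, the base URL, and the lack of authentication are all assumptions for illustration.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterable, List


def days_with_different_counts(source_buckets: List[Dict], dest_buckets: List[Dict]) -> List[str]:
    """Return the days whose doc_count differs between the two indices.

    Each argument is the "buckets" list of a date_histogram aggregation response.
    """
    src = {b["key_as_string"]: b["doc_count"] for b in source_buckets}
    dst = {b["key_as_string"]: b["doc_count"] for b in dest_buckets}
    # A day missing from one side counts as 0 documents there.
    return sorted(d for d in src.keys() | dst.keys() if src.get(d, 0) != dst.get(d, 0))


def ids_missing_from_source(dest_ids: Iterable[str], exists_in_source: Callable[[str], bool]) -> List[str]:
    """Return the destination ids that the source lookup reports as not found."""
    return [doc_id for doc_id in dest_ids if not exists_in_source(doc_id)]


def make_head_check(base_url: str, index: str) -> Callable[[str], bool]:
    """Build a check that issues HEAD <index>/_doc/<id> (hypothetical helper, no auth)."""
    def exists(doc_id: str) -> bool:
        url = f"{base_url}/{index}/_doc/{urllib.parse.quote(doc_id, safe='')}"
        request = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(request) as response:
                return 200 <= response.status < 300
        except urllib.error.HTTPError as err:
            if err.code == 404:
                return False  # the document is gone from the source index
            raise  # any other status is a real error, not a deletion
    return exists
```

For each day returned by `days_with_different_counts`, something like `ids_missing_from_source(ids_for_that_day, make_head_check("http://localhost:9200", "old-index"))` would give you the ids to delete from the new index. The URL and index name there are placeholders.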
