Reindex does not copy all documents

Hello,

I am trying to migrate data from an ES 5.5.6 cluster to ES 7.6.2, using an intermediate ES 6.8 cluster to reindex the data and prepare the mappings for version 7.6.2.
I am using a Python script (built on the elasticsearch-py library) to automate this migration. It creates a snapshot of the indices in my 5.5.6 cluster, recreates all indices with their settings and mappings in the intermediate 6.8 cluster, reindexes the data, creates another snapshot, and restores it in the final 7.6.2 cluster.

My issue is that when I reindex the data in the intermediate 6.8 cluster, the new index does not end up with the same number of documents as the original index. I did not encounter any errors during the process, except that the cluster sometimes raises connection timeouts.
Below are the exact parameters of the reindex operation:

reindexed_indices = esReindex.reindex({"source": {"index": index['index']}, "dest": {"index": index['index'] + '_new'}, "script": {"source": "\n ctx._source.type = ctx._type;\n ctx._id = ctx._type + \"-\" + ctx._id;\n ctx._type = \"_doc\";\n "}}, wait_for_completion=True, slices=20)
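
With `wait_for_completion=True`, the value stored in `reindexed_indices` is the reindex response, which carries per-operation counters. As a rough sketch of how those counters could be inspected, assuming the usual response fields (`total`, `created`, `updated`, `version_conflicts`, `failures`): `summarize_reindex_response` is a hypothetical helper name, and the sample response is made up for illustration, not taken from my cluster.

```python
def summarize_reindex_response(response):
    """Pull the document counters out of a reindex response dict (hypothetical helper)."""
    counters = ("total", "created", "updated", "deleted", "noops", "version_conflicts")
    # Missing counters are treated as 0 so a partial response does not raise.
    summary = {name: response.get(name, 0) for name in counters}
    summary["failures"] = len(response.get("failures", []))
    # Documents read from the source that did not become new documents in the destination.
    summary["not_created"] = summary["total"] - summary["created"]
    return summary


# Made-up response for illustration only; not output from a real cluster.
example_response = {
    "total": 1000,
    "created": 990,
    "updated": 10,
    "version_conflicts": 0,
    "failures": [],
}
summary = summarize_reindex_response(example_response)
print(summary)
```

A non-zero `updated` or `version_conflicts` value would suggest that some documents were written to an `_id` that already existed in the destination, which is one possible reason for a lower document count.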

I also tried rerunning the reindex with the op_type: create option:

reindexed_indices = esReindex.reindex({"conflicts": "proceed", "source": {"index": index['index']}, "dest": {"index": index['index'] + '_new', "op_type": "create"}, "script": {"source": "\n ctx._source.type = ctx._type;\n ctx._id = ctx._type + \"-\" + ctx._id;\n ctx._type = \"_doc\";\n "}}, wait_for_completion=True, slices=20)
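
To make the mismatch concrete, here is a minimal sketch of the kind of comparison I mean between each original index and its `_new` counterpart. It assumes the counts were already collected into plain dicts, for example with `es.count(index=name)["count"]`. `find_count_mismatches`, the index names, and the numbers are all hypothetical.

```python
def find_count_mismatches(source_counts, dest_counts, suffix="_new"):
    """Return {index: (expected, actual)} for every index whose counts differ."""
    mismatches = {}
    for name, expected in source_counts.items():
        # A destination index that is missing entirely is treated as holding 0 documents.
        actual = dest_counts.get(name + suffix, 0)
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches


# Hypothetical counts for illustration only.
source_counts = {"logs": 1000, "users": 250}
dest_counts = {"logs_new": 990, "users_new": 250}

mismatches = find_count_mismatches(source_counts, dest_counts)
print(mismatches)  # {'logs': (1000, 990)}
```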

Has anyone encountered a similar issue and managed to fix it?

Thanks,
Diana
