I am migrating my index from ES 1.4.4 to 5.1.1 using the "_reindex" API. The reindex process always aborts without any errors before it has processed all records. Here are the details:
Create a new index with settings in 5.1.1 ES server:
As you can see from 3), everything looks good. But after a couple of hours, the task finished and only about 60K records had been indexed on the ES 5.1.1 server instead of 5653044. I repeated the process a few times, and it always aborted without any errors.
You should be able to get the status of those reindex tasks with GET /_tasks/<taskId> where the <taskId> is whatever id was returned when you started. If you didn't store them then you should be able to look around with something like GET .tasks/_search. Those should contain the failure reason. Or, if it thinks it finished successfully, it should show you that.
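As a rough illustration of the two lookups described above, here is a minimal Python sketch. The base URL `http://localhost:9200` and the helper names are assumptions for illustration, and the layout of the stored task documents (an `error` object with `type` and `reason` under `_source`) is inferred from the kind of output these calls return, so check it against your own response.

```python
import json
import urllib.request

# Assumption: the 5.1.1 node is reachable here; change to match your cluster.
ES_URL = "http://localhost:9200"


def task_status_url(task_id, base=ES_URL):
    """URL for GET /_tasks/<taskId>."""
    return f"{base}/_tasks/{task_id}"


def tasks_search_url(base=ES_URL):
    """URL for GET .tasks/_search, which lists stored task results."""
    return f"{base}/.tasks/_search"


def extract_failures(search_response):
    """Collect (task id, error type, reason) for every stored task with an error."""
    failures = []
    for hit in search_response.get("hits", {}).get("hits", []):
        error = hit.get("_source", {}).get("error")
        if error:
            failures.append((hit.get("_id"), error.get("type"), error.get("reason")))
    return failures


def fetch_json(url):
    """Plain GET that returns the parsed JSON body (needs a live cluster)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

Against a running cluster, `extract_failures(fetch_json(tasks_search_url()))` would list the failed tasks; tasks that finished successfully carry no `error` object and are skipped.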
OK, the command "..9200/.tasks/_search" is a good one; it gives me the error info:
Error:
```
    "type": "process_cluster_event_timeout_exception",
    "reason": "failed to process cluster event (put-mapping) within 30s"
  },
  "status": 503
```
1) Is there any way to change the timeout (30s) to something longer? I have many different doc types inside the index, and each doc type can have more than 1k docs.
2) If an error occurs during the reindexing process, is there any way to ignore it and continue? I have a few million records, so restarting from scratch is time-consuming. It's OK if I lose some data in the reindex process.
We don't need to store the errors; we just want to continue reindexing the next records without aborting because of the last failure. Is there any setting to make that happen? What do you mean by "reindexing in chunks"? We are using "size" for batching, right?
Use a query to limit what you are reindexing to certain days or namespaces or something. Whatever natural division your data has. Then do it again and again until you migrate all the data.
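To make the chunking idea concrete, here is a minimal Python sketch that builds one `_reindex` request body per day. It assumes the documents have a date field (the name `timestamp` is hypothetical) and that the old 1.4.4 cluster is read through reindex-from-remote; the host and index names are placeholders. Note that `size` inside `source` only sets the scroll batch size, whereas the `query` is what limits each run to one slice of the data.

```python
from datetime import date, timedelta


def daily_reindex_bodies(start, end, source_index, dest_index, remote_host,
                         date_field="timestamp", batch_size=1000):
    """Yield one _reindex request body per day in the half-open range [start, end)."""
    day = start
    while day < end:
        next_day = day + timedelta(days=1)
        yield {
            "source": {
                # Assumption: the old cluster is read via reindex-from-remote.
                "remote": {"host": remote_host},
                "index": source_index,
                "size": batch_size,  # scroll batch size, not a limit on documents
                "query": {
                    "range": {
                        # Hypothetical date field; use whatever natural division your data has.
                        date_field: {"gte": day.isoformat(), "lt": next_day.isoformat()}
                    }
                },
            },
            "dest": {"index": dest_index},
        }
        day = next_day
```

Each body would be sent as its own POST to `_reindex`, so a failure costs one day of data to redo rather than the whole index.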
We are currently over a month behind on a migration because silent failures force us to babysit each and every reindex. We've resorted to setting the logger.root to "Debug". Good luck!