Hi,
I need to migrate data between clusters and am looking to do this via reindex from remote using Curator. However, I'm seeing that the rate of reindexing is very slow. How can I increase the reindexing rate?
Hi @theuntergeek,
I've tried running a sliced reindex and get this error:
Exception: TransportError(400, 'action_request_validation_exception', "Validation Failed: 1: reindex from remote sources doesn't support workers > 1 but was [2]
Somehow I missed that you were doing a remote reindex (I answered another slow reindex question at around the same time, and may have thought this was similar at first glance). Only a local reindex can use slices.
You also need to understand that Curator is just an index selection wrapper that makes standard Elasticsearch API calls. You could run this entire command inside Console in Kibana and you would get the exact same result you are seeing with Curator.
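To make that concrete, here is a rough sketch of the request body that a reindex from remote boils down to, whether it is sent by Curator or pasted into Console. The host URL and index names are placeholders I made up, and the helper function is hypothetical, not part of Curator.

```python
import json


def build_remote_reindex_body(remote_host, source_index, dest_index):
    """Build the JSON body for POST _reindex with a remote source.

    All three arguments are placeholders for illustration only.
    """
    return {
        "source": {
            # "remote" is what makes this a reindex from remote.
            "remote": {"host": remote_host},
            "index": source_index,
        },
        "dest": {"index": dest_index},
    }


if __name__ == "__main__":
    body = build_remote_reindex_body(
        "https://remote-cluster.example:9200",  # hypothetical remote host
        "source-index",                         # hypothetical index names
        "dest-index",
    )
    print("POST _reindex")
    print(json.dumps(body, indent=2))
```

Pasting the printed body under `POST _reindex` in Console should issue the same kind of request, so the throughput you see there ought to match what you see through Curator.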
What this means is that the slowness can be accounted for by:
Network latency/speed
The performance of the remote cluster
The performance of the local cluster
The shard count of the target index (higher shard counts can increase indexing speed)
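On the last point: the primary shard count is fixed when an index is created, so the target index has to exist with the shard count you want before the reindex starts. A minimal sketch of the create-index body follows; the helper name, the index name, and the shard count of 6 are only illustrative assumptions, and the right number depends on your cluster.

```python
import json


def build_target_index_settings(number_of_shards):
    """Build the body for PUT <target-index> with an explicit shard count."""
    if number_of_shards < 1:
        raise ValueError("number_of_shards must be at least 1")
    return {"settings": {"index": {"number_of_shards": number_of_shards}}}


if __name__ == "__main__":
    # "dest-index" and 6 are made-up values for illustration.
    print("PUT dest-index")
    print(json.dumps(build_target_index_settings(6), indent=2))
```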
If you need reindex from remote to be faster, improving the factors above is about the only way to speed things up. Another way to transfer data from a remote cluster to a local one is snapshot/restore, which works when both clusters have access to the same network storage system (S3, GCP, Azure, etc.).
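If you go the snapshot/restore route, the sequence of API calls looks roughly like the sketch below. It assumes an S3 repository (which needs S3 repository support available on both clusters); the repository, snapshot, bucket, and index names are all placeholders, and the helper is hypothetical. Registering the repository as read-only on the target cluster is a precaution so that only one cluster writes to it.

```python
def snapshot_restore_steps(repo, snapshot, indices, bucket):
    """Return (cluster, method, path, body) tuples for a snapshot/restore migration.

    Every argument is a placeholder; adapt the repository type and settings
    to whatever shared storage both clusters can reach.
    """
    repo_settings = {"type": "s3", "settings": {"bucket": bucket}}
    readonly_settings = {"type": "s3", "settings": {"bucket": bucket, "readonly": True}}
    return [
        # 1. Register the shared repository on the source cluster.
        ("source", "PUT", f"_snapshot/{repo}", repo_settings),
        # 2. Snapshot the indices to migrate and wait for completion.
        ("source", "PUT", f"_snapshot/{repo}/{snapshot}?wait_for_completion=true",
         {"indices": indices}),
        # 3. Register the same repository, read-only, on the target cluster.
        ("target", "PUT", f"_snapshot/{repo}", readonly_settings),
        # 4. Restore the indices on the target cluster.
        ("target", "POST", f"_snapshot/{repo}/{snapshot}/_restore",
         {"indices": indices}),
    ]


if __name__ == "__main__":
    for cluster, method, path, body in snapshot_restore_steps(
        "migration_repo", "snap-1", "source-index", "my-bucket"
    ):
        print(f"[{cluster}] {method} {path} {body}")
```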