Determine when clone index has completed

madshov · December 13, 2021, 2:49pm

Hi,
I'm writing some functionality where step 1 is to back up the current ES index. For this I'm using the _cloneIndex request, which seems like the most appropriate choice. But how do I know for sure when it has finished? Is it enough to check whether the cluster status is yellow or green?

Would it be smarter to use _reindex instead to create a copy of my index? I just realised the _clone request requires the index to be in block-write mode, which will cause a problem with all new incoming index requests while cloning.

DineshNaik · December 14, 2021, 4:56am

The cloning process can be monitored with the _cat recovery API, or the cluster health API can be used to wait until all primary shards have been allocated by setting the wait_for_status parameter to yellow.

The _clone API returns as soon as the target index has been added to the cluster state, before any shards have been allocated. At this point, all shards are in the state unassigned. If, for any reason, the target index can’t be allocated, its primary shard will remain unassigned until it can be allocated on that node.

Once the primary shard is allocated, it moves to state initializing, and the clone process begins. When the clone operation completes, the shard will become active. At that point, Elasticsearch will try to allocate any replicas and may decide to relocate the primary shard to another node.
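As a rough illustration of the health-check approach, here is a minimal Python sketch that waits for a cloned index to reach yellow. It uses only the standard library. The cluster address, the index name passed in, the 30s timeout and the retry count are assumptions for the example, not values from this thread, and it assumes a cluster without authentication.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Assumption: an unsecured cluster reachable at this address.
ES_URL = "http://localhost:9200"


def health_url(index, status="yellow", timeout="30s", base=ES_URL):
    """Build a cluster health URL that blocks until `index` reaches `status`."""
    query = urllib.parse.urlencode({"wait_for_status": status, "timeout": timeout})
    return f"{base}/_cluster/health/{urllib.parse.quote(index, safe='')}?{query}"


def clone_is_ready(health):
    """True if the wait did not time out and the index is yellow or green."""
    return not health.get("timed_out", True) and health.get("status") in ("yellow", "green")


def wait_for_clone(index, attempts=10):
    """Poll cluster health until the cloned index has its primaries allocated."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(health_url(index)) as resp:
                if clone_is_ready(json.load(resp)):
                    return True
        except urllib.error.HTTPError as err:
            # Some versions answer 408 when the wait times out; retry on that only.
            if err.code != 408:
                raise
    return False
```

Calling `wait_for_clone("my-index-backup")` right after the clone request would then return True once the target is at least yellow. Checking the health of the target index rather than the whole cluster keeps unrelated indices from affecting the result.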

And yes, to clone an index, the index must be marked as read-only and have a cluster health status of green.
The documentation is pretty good and has all the details you need. Please refer to the clone index and reindex API documentation.

You can check the prerequisites section for both options.
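To make the read-only prerequisite concrete, here is a small sketch of the request sequence around a clone: block writes on the source, clone, wait for the target, then lift the block on the source again. It only builds and prints the requests rather than sending them. The index names are placeholders, and the last step (re-enabling writes) is my assumption about what you would want for a backup workflow.

```python
import json


def clone_steps(source, target):
    """Return the (method, path, body) requests for a write-blocked clone."""
    return [
        # 1. Block writes on the source; the clone API requires this.
        ("PUT", f"/{source}/_settings", {"settings": {"index.blocks.write": True}}),
        # 2. Start the clone; this returns before any shards are allocated.
        ("POST", f"/{source}/_clone/{target}", None),
        # 3. Wait until the target's primary shards are allocated.
        ("GET", f"/_cluster/health/{target}?wait_for_status=yellow&timeout=30s", None),
        # 4. Allow writes on the source again (assumed desirable for a backup).
        ("PUT", f"/{source}/_settings", {"settings": {"index.blocks.write": False}}),
    ]


if __name__ == "__main__":
    # Placeholder index names, for illustration only.
    for method, path, body in clone_steps("my-index", "my-index-backup"):
        print(method, path)
        if body is not None:
            print(json.dumps(body, indent=2))
```

Writes to the source are rejected between steps 1 and 4, so how long that window lasts depends on how quickly the clone finishes.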

madshov · December 14, 2021, 12:19pm

Hi Dinesh,
Thanks for your reply and the links to the documentation. One thing I was wondering about is the following scenario. Let's say I have a big index that I want to back up into another index. I assume it will take a while to create the new index and all the documents in it. What happens to new documents that are indexed during that period? If clone requires read-only, I assume these new index requests will fail. Should I be using reindex instead?

DineshNaik · December 14, 2021, 2:14pm

@madshov,

Yes, you can go for the reindex API, but even then the suggested approach is to stop ingestion during the process, if possible.

If that's not possible, you will have to take care of delta updates in the new index.
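One possible way to handle the delta, sketched below in Python, is to run a second reindex restricted to documents changed since the first pass started. This assumes your documents carry a timestamp field; `updated_at` here is a hypothetical field name, and the index names are placeholders. The sketch also assumes you start the reindex with `wait_for_completion=false` and poll the tasks API with the returned task id.

```python
# Path for starting a reindex as a background task.
REINDEX_PATH = "/_reindex?wait_for_completion=false"


def reindex_body(source, dest, since=None, timestamp_field="updated_at"):
    """Build a _reindex body; with `since`, copy only recently changed docs.

    `timestamp_field` is hypothetical: it must exist in your own mapping.
    """
    body = {"source": {"index": source}, "dest": {"index": dest}}
    if since is not None:
        body["source"]["query"] = {"range": {timestamp_field: {"gte": since}}}
    return body


def task_is_done(task_response):
    """Interpret a GET /_tasks/<task_id> response."""
    return bool(task_response.get("completed", False))


if __name__ == "__main__":
    # First pass: full copy. Second pass: only documents changed since then.
    print(reindex_body("my-index", "my-index-backup"))
    print(reindex_body("my-index", "my-index-backup", since="2021-12-14T00:00:00Z"))
```

A delta pass like this re-copies updated documents under the same IDs, but it does not pick up documents deleted from the source in the meantime.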

You can create an alias if possible. The benefit of using the alias is that we can avoid downtime and easily roll back the migration if there is something wrong with the new index. That’s because just switching the alias can be completed quickly.
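For illustration, an alias switch is a single request to the aliases API with a remove and an add action, which Elasticsearch applies atomically. The sketch below only builds the request body in Python; the alias and index names are placeholders, not names from this thread.

```python
import json


def alias_swap(alias, old_index, new_index):
    """Build a POST /_aliases body that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    print(json.dumps(alias_swap("my-alias", "my-index-v1", "my-index-v2"), indent=2))
```

Rolling back is the same request with the two index names exchanged, provided clients read and write through the alias rather than the concrete index name.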

madshov · December 16, 2021, 11:52am

Hi @DineshNaik
Thanks for your advice.
