Proper way to dump indices from Elasticsearch and import to another Elasticsearch instance

kent010341 · July 29, 2022, 1:49am

I have an ELK stack deployed in Docker containers on a private network, and I need to export some of the indices to a device (such as a laptop) and import them into another Elasticsearch instance on another private network.

These two networks can't connect to each other, so I have to save the indices as files or something similar.

The process I found on the internet is this (link). The steps to bring indices from instance A to instance B are:

1. Register a snapshot repository with the Create or update snapshot repository API, using the same name on A and B.
2. Create a snapshot with the Create snapshot API, using the same name on A and B. When creating the snapshot on A, specify which indices to export (for example, "index_2022.07.29" with body {"indices":"index_2022.07.29"}).
3. Copy all contents of the folder that the snapshot repository is registered with (A instance).
4. Remove all contents of the folder that the snapshot repository is registered with (B instance), and paste all contents from A into this folder.
5. Restart Elasticsearch on B.
6. Restore on B with the Restore snapshot API, specifying "index_2022.07.29" with body {"indices":"index_2022.07.29"}.
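
For readers who want to see the three API calls named in these steps, here is a minimal Python sketch using only the standard library. The URL, repository name, snapshot name and location are hypothetical placeholders, not values from this thread; the "fs" repository type assumes the folder is listed under path.repo in elasticsearch.yml; and wait_for_completion=true is an optional choice so each call returns only once the work is done. It shows the requests only, not the order of the file operations.

```python
import json
import urllib.request

# Hypothetical values, not taken from the thread.
ES_URL = "http://localhost:9200"
REPO = "export_repo"
SNAPSHOT = "snapshot_2022.07.29"
LOCATION = "/usr/share/elasticsearch/backup"  # must be listed under path.repo


def register_repo(repo, location):
    """Create or update snapshot repository API (shared file system type)."""
    body = {"type": "fs", "settings": {"location": location}}
    return "PUT", f"/_snapshot/{repo}", body


def create_snapshot(repo, snapshot, indices):
    """Create snapshot API, limited to the given indices."""
    path = f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true"
    return "PUT", path, {"indices": indices}


def restore_snapshot(repo, snapshot, indices):
    """Restore snapshot API, limited to the given indices."""
    path = f"/_snapshot/{repo}/{snapshot}/_restore?wait_for_completion=true"
    return "POST", path, {"indices": indices}


def send(base_url, request):
    """Send one (method, path, body) tuple; no authentication or TLS handling."""
    method, path, body = request
    req = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the requests; call send(ES_URL, request) to execute one.
    for request in (
        register_repo(REPO, LOCATION),
        create_snapshot(REPO, SNAPSHOT, "index_2022.07.29"),
        restore_snapshot(REPO, SNAPSHOT, "index_2022.07.29"),
    ):
        print(request)
```

Note that a restore fails if an open index with the same name already exists on the target cluster, so the index would need to be closed, deleted or renamed on restore in that case.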

I don't know whether there's an officially recommended process for this export-import requirement. I'm worried that the steps I found are risky or unreliable.

Also, if I want to run the export-import process several times, for example exporting index_2022.07.27, index_2022.07.28 and index_2022.07.29 from A as separate folders (or files) and restoring each index on B one by one, what should I do? (For example, should I delete the snapshot with the Delete snapshot API at the end of the import process?)

Thanks.

DavidTurner · July 29, 2022, 7:51am

That's almost the right process. You're pretty much taking a repository backup and then restoring it at the new location. However it's important that you don't modify the contents of a repository while it's registered with Elasticsearch (see these docs for details).

To follow the official repository backup process, you should unregister the repository on A before step 3, and you should not register the repository on B until after step 4.
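
The rule above (no file operations on a repository folder while it is registered) can be expressed as a small check. This is a toy model, not an Elasticsearch feature: the step names are made up for illustration, and "unregister" stands for the Delete snapshot repository API (DELETE /_snapshot/<repository>), which removes only the registration and leaves the files in place.

```python
# Toy model: "copy", "paste" and "clear" stand for any file operation on the
# repository folder; every other step name is ignored by the check.
FILE_OPS = {"copy", "paste", "clear"}


def first_unsafe_step(steps):
    """Return the position of the first file operation that happens while
    the repository is registered, or None if the sequence is safe."""
    registered = False
    for position, step in enumerate(steps):
        if step == "register":
            registered = True
        elif step == "unregister":
            registered = False
        elif step in FILE_OPS and registered:
            return position
    return None


# The sequences from the question, and the same sequences with the fix applied.
ORIGINAL_A = ["register", "snapshot", "copy"]
CORRECTED_A = ["register", "snapshot", "unregister", "copy"]
ORIGINAL_B = ["register", "snapshot", "clear", "paste", "restart", "restore"]
CORRECTED_B = ["clear", "paste", "register", "restart", "restore"]

if __name__ == "__main__":
    for name, steps in [
        ("original A", ORIGINAL_A),
        ("corrected A", CORRECTED_A),
        ("original B", ORIGINAL_B),
        ("corrected B", CORRECTED_B),
    ]:
        print(name, "->", first_unsafe_step(steps))
```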

kent010341 · July 29, 2022, 9:03am

@DavidTurner Thanks for your help! I tried it and the index was successfully imported into the B instance!

I have a more advanced use case that I need help with. The process is now:

1. Register the snapshot repository on the A instance.
2. Create a snapshot on the A instance, specifying one index.
3. Unregister the snapshot repository on the A instance.
4. Copy all contents of the folder that the snapshot repository was registered with (A instance).
5. Paste all contents from the A instance into the repository folder of the B instance (the first time, this folder is empty).
6. Register the snapshot repository on the B instance.
7. Restart Elasticsearch on B. (Is this necessary?)
8. Restore on B, specifying the index that was specified when creating the snapshot on the A instance.
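
The copy and paste steps in this list could be scripted roughly as follows. This is a sketch under assumptions: the thread does not say how the files are moved, so packing them into a single .tar.gz with shutil is just one option, and the function names and the file names in the demo are placeholders. It expects the repository to be unregistered on both sides while it runs.

```python
import shutil
import tempfile
from pathlib import Path


def pack_repository(repo_dir, archive_base):
    """Pack the contents of repo_dir into <archive_base>.tar.gz and return
    the archive path. Keep archive_base outside repo_dir."""
    return shutil.make_archive(str(archive_base), "gztar", root_dir=str(repo_dir))


def unpack_repository(archive_path, repo_dir):
    """Unpack the archive into repo_dir, refusing to mix with older contents."""
    repo_dir = Path(repo_dir)
    if repo_dir.exists() and any(repo_dir.iterdir()):
        raise RuntimeError(f"{repo_dir} is not empty")
    repo_dir.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive_path), str(repo_dir))


def list_files(root):
    """Relative paths of all files below root, sorted."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


def demo():
    """Round trip with placeholder files standing in for a real repository."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        repo_a = tmp / "repo_a"
        (repo_a / "indices" / "abc" / "0").mkdir(parents=True)
        (repo_a / "index-0").write_text("placeholder")
        (repo_a / "indices" / "abc" / "0" / "data").write_text("placeholder")
        archive = pack_repository(repo_a, tmp / "repo_export")
        repo_b = tmp / "repo_b"
        unpack_repository(archive, repo_b)
        return list_files(repo_b)


if __name__ == "__main__":
    print(demo())
```

The single archive is what gets carried between the two networks on the laptop or other device.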

As I said in my question, this process needs to be run several times.

If I follow steps 1 to 8 to export and import "index_1", what else should I do to export and import "index_2"?

On the A instance, I think I only need to register again and create another snapshot, just like steps 1 to 4. But on the B instance, should I paste the contents from A directly over the existing ones, or delete all previous contents first and then paste the new ones?

Thanks.

DavidTurner · July 29, 2022, 10:06am

No, a restart isn't necessary, as long as you don't change the repository contents after it's registered.

I'd start with a new empty repository each time.
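
One way to get back to an empty repository folder between imports is sketched below. It assumes the repository is currently unregistered and that the folder itself should be kept (for example because it is a Docker mount point, which is a guess based on the setup described in the question); the function name is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def empty_directory(repo_dir):
    """Delete everything inside repo_dir but keep the directory itself."""
    for entry in Path(repo_dir).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # real subdirectory: remove recursively
        else:
            entry.unlink()         # regular file or symlink


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / "indices").mkdir()
        (repo / "indices" / "data").write_text("placeholder")
        (repo / "index-0").write_text("placeholder")
        empty_directory(repo)
        print(list(repo.iterdir()))
```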

kent010341 · July 29, 2022, 10:45am

@DavidTurner It seems that it's safe to delete or replace all contents once the repository is unregistered.

If I replace the existing contents with the contents generated by a snapshot from any Elasticsearch instance and then register the repository, the new snapshot is loaded and can be restored.

Therefore, to import another snapshot from A, B should unregister its repository, replace all contents with the contents from A, register the repository again, and then restore.

Using the step numbers from my previous reply, importing two indices one by one would be:

1 => 2 => 3 => 4 => 5 => 6 => 8 => unregister B repo => 1 => 2 => 3 => 4 => remove all contents of repository folder of B => 5 => 6 => 8
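
The same sequence can be generated for any number of indices. The sketch below only reproduces the ordering written above (step 7, the restart, is left out as in that sequence); the function name is hypothetical and nothing here talks to Elasticsearch.

```python
A_STEPS = ["1", "2", "3", "4"]  # on A: register, snapshot, unregister, copy
B_STEPS = ["5", "6", "8"]       # on B: paste, register, restore


def sequence(n_indices):
    """Build the step sequence for importing n_indices indices one by one."""
    steps = []
    for n in range(n_indices):
        if n > 0:
            # B still has the previous round's repository registered.
            steps.append("unregister B repo")
        steps += A_STEPS
        if n > 0:
            # Start B from an empty folder before pasting the new contents.
            steps.append("remove all contents of repository folder of B")
        steps += B_STEPS
    return " => ".join(steps)


if __name__ == "__main__":
    print(sequence(2))
```

Since a restore runs in the background by default, each restore should have finished before the B repository is unregistered for the next round; adding wait_for_completion=true to the restore request is one way to ensure that.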

Is my understanding correct?

Thanks.

leandrojmp · July 29, 2022, 12:17pm

Do your Elasticsearch deployments have internet access?

If they both have access to the internet, it would be easier to keep your snapshots in a cloud repository and share it; in the second deployment the repository would be read-only.
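
As an illustration of the read-only setup, the two clusters would register the same repository with different settings. The sketch assumes an S3 repository purely as an example (the reply does not name a cloud provider) and uses a hypothetical bucket name; "readonly" is the repository setting that stops the second cluster from writing to it.

```python
def cloud_repo_body(bucket, readonly):
    """Request body for PUT /_snapshot/<repository> with an S3 repository."""
    settings = {"bucket": bucket}
    if readonly:
        settings["readonly"] = True
    return {"type": "s3", "settings": settings}


# Hypothetical bucket name, not taken from the thread.
WRITER_BODY = cloud_repo_body("my-snapshot-bucket", readonly=False)  # first deployment
READER_BODY = cloud_repo_body("my-snapshot-bucket", readonly=True)   # second deployment

if __name__ == "__main__":
    print(WRITER_BODY)
    print(READER_BODY)
```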

This has an extra cost, but if you need to do this frequently, maybe the amount of work that it will save you can justify this extra cost.

kent010341 · July 29, 2022, 12:52pm

@leandrojmp One of the deployments cannot access the internet for some reason, so I have to find a way to export those indices.

kent010341 · August 2, 2022, 2:50am

I've successfully done the export-import several times without any problems.
Thanks for the help.

Let me share the scripts I wrote while testing, for anyone who has the same requirement.

Export: GitHub
Import: GitHub

