Proper way to dump indices from Elasticsearch and import to another Elasticsearch instance

I have an ELK stack deployed with Docker containers in a private network, and I have to export some of its indices to a device (such as a laptop) and import them into another Elasticsearch instance in another private network.

These two networks cannot connect to each other, so I have to save the indices as files or something similar.

The process I found on the internet is this (link). The steps to bring indices from instance A to instance B are:

  1. Register a snapshot repository with the Create or update snapshot repository API, using the same name on A and B.
  2. Create a snapshot with the Create snapshot API, using the same name on A and B. While creating the snapshot on A, specify which indices to export (for example, specify "index_2022.07.29" with the body {"indices":"index_2022.07.29"}).
  3. Copy all contents of the folder the snapshot repository is registered with (on instance A).
  4. Remove all contents of the folder the snapshot repository is registered with (on instance B), and paste all contents from A into this folder.
  5. Restart Elasticsearch on B.
  6. Restore on B with the Restore snapshot API, specifying "index_2022.07.29" with the body {"indices":"index_2022.07.29"}.
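The API calls in steps 1, 2, and 6 can be sketched with curl. This is a rough sketch; the repository name `my_repo`, the path `/mnt/snapshots`, and the snapshot name `snap_1` are placeholders, not taken from the original post:

```shell
# Step 1 (run on both A and B): register a shared-filesystem snapshot repository.
# The location must be listed under path.repo in elasticsearch.yml.
curl -X PUT "localhost:9200/_snapshot/my_repo" \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/mnt/snapshots"}}'

# Step 2 (on A): snapshot only the index to export, waiting for completion.
curl -X PUT "localhost:9200/_snapshot/my_repo/snap_1?wait_for_completion=true" \
  -H 'Content-Type: application/json' \
  -d '{"indices": "index_2022.07.29"}'

# Step 6 (on B, after the repository contents were copied over): restore it.
curl -X POST "localhost:9200/_snapshot/my_repo/snap_1/_restore" \
  -H 'Content-Type: application/json' \
  -d '{"indices": "index_2022.07.29"}'
```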

I don't know whether there is an officially recommended process for this export-import requirement. I am worried that the steps I found are risky or unstable.

And if I want to run the export-import process several times, for example exporting index_2022.07.27, index_2022.07.28, and index_2022.07.29 from A as separate folders (or files) and restoring each index on B one by one, what should I do? (For example, should I delete the snapshot with the Delete snapshot API at the end of each import?)


That's almost the right process. You're essentially taking a repository backup and then restoring it at the new location. However, it's important that you don't modify the contents of a repository while it's registered with Elasticsearch (see these docs for details).

To follow the official repository backup process, you should unregister the repository on A before step 3, and you should not register the repository on B until after step 4.
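Unregistering a repository deletes only the registration, not the snapshot files on disk. A sketch of the ordering described above (the repository name `my_repo` and the path are placeholders):

```shell
# On A, after the snapshot has completed: unregister the repository
# before copying its files. This removes only the registration.
curl -X DELETE "localhost:9200/_snapshot/my_repo"

# ... copy the repository folder from A to B ...

# On B, only after the copied contents are in place: register the repository.
curl -X PUT "localhost:9200/_snapshot/my_repo" \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/mnt/snapshots"}}'
```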

@DavidTurner Thanks for your help! I tried it and the index was successfully imported into the B instance!

I have a more advanced use case that I need help with. The process is now:

  1. Register the snapshot repository on instance A.
  2. Create a snapshot on instance A, specifying one index.
  3. Unregister the snapshot repository on instance A.
  4. Copy all contents of the folder the snapshot repository was registered with (on instance A).
  5. Paste all contents from instance A into the repository folder of instance B (the first time, this folder is empty).
  6. Register the snapshot repository on instance B.
  7. Restart Elasticsearch on B. (Is this necessary?)
  8. Restore, specifying the index that was specified when creating the snapshot on A.
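Steps 1 to 8 above could be condensed into a sketch like this. All names and paths are placeholders; `A_HOST` and `B_HOST` stand for the two clusters, which are reached separately (e.g. from a laptop) since the networks cannot see each other:

```shell
REPO=my_repo; SNAP=snap_1; INDEX=index_1   # placeholder names

# --- On network A ---
# Step 1: register the repository.
curl -X PUT "A_HOST:9200/_snapshot/$REPO" -H 'Content-Type: application/json' \
  -d '{"type":"fs","settings":{"location":"/mnt/snapshots"}}'
# Step 2: snapshot one index.
curl -X PUT "A_HOST:9200/_snapshot/$REPO/$SNAP?wait_for_completion=true" \
  -H 'Content-Type: application/json' -d "{\"indices\":\"$INDEX\"}"
# Step 3: unregister the repository before copying its files.
curl -X DELETE "A_HOST:9200/_snapshot/$REPO"

# Steps 4-5: copy /mnt/snapshots to a device, carry it to network B,
# and paste it into B's (empty) repository folder.

# --- On network B ---
# Step 6: register the repository over the copied contents.
curl -X PUT "B_HOST:9200/_snapshot/$REPO" -H 'Content-Type: application/json' \
  -d '{"type":"fs","settings":{"location":"/mnt/snapshots"}}'
# Step 8: restore the index.
curl -X POST "B_HOST:9200/_snapshot/$REPO/$SNAP/_restore" \
  -H 'Content-Type: application/json' -d "{\"indices\":\"$INDEX\"}"
```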

Like I said in my question, this process needs to be run several times.

If I do steps 1 to 8 in order to export and import "index_1", what else should I do if I want to do this export-import to "index_2"?

On the A instance, I think I only need to register the repository again and create another snapshot, just like steps 1 to 4. But on the B instance, should I paste all contents from A directly, or delete all the previous contents first and then paste the new contents?


No, not if you don't change the repository contents after it's registered.

I'd start with a new empty repository each time.

@DavidTurner It seems like it's safe to delete or replace all contents after the repository is unregistered.

If I replace the existing contents with contents generated by a snapshot from any Elasticsearch instance and then register the repository, the new snapshot will be loaded and can be restored.

Therefore, to import another snapshot from A, B should unregister its repository, replace all contents with the contents from A, register the repository again, and then restore.

Using the step numbers from my previous reply, importing two indices one by one would be:

1 => 2 => 3 => 4 => 5 => 6 => 8 => unregister B repo => 1 => 2 => 3 => 4 => remove all contents of repository folder of B => 5 => 6 => 8
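The repeated-import part on B could be sketched as a small script. This assumes the new repository contents from A are staged in `/tmp/from_a`; the paths and names (`my_repo`, `snap_2`, `index_2`) are placeholders:

```shell
#!/bin/sh
REPO_DIR=/mnt/snapshots   # placeholder: B's registered repository path

# Unregister B's repository so its contents may be safely modified.
curl -X DELETE "localhost:9200/_snapshot/my_repo"

# Remove the previous contents and paste in the new contents from A.
rm -rf "$REPO_DIR"/*
cp -r /tmp/from_a/. "$REPO_DIR"/

# Re-register the repository, then restore the newly copied snapshot.
curl -X PUT "localhost:9200/_snapshot/my_repo" -H 'Content-Type: application/json' \
  -d '{"type":"fs","settings":{"location":"/mnt/snapshots"}}'
curl -X POST "localhost:9200/_snapshot/my_repo/snap_2/_restore" \
  -H 'Content-Type: application/json' -d '{"indices":"index_2"}'
```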

Is my understanding correct?


Do your Elasticsearch deployments have internet access?

If they both have access to the internet, it would be easier to keep your snapshots in a cloud repository and share it; in the second deployment the repository would be read-only.

This has an extra cost, but if you need to do this frequently, the amount of work it saves you may justify the expense.
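For reference, a shared repository can be mounted read-only on the consuming cluster via the `readonly` repository setting. This is a sketch; the bucket name is a placeholder and it assumes S3 repository support is configured on both clusters:

```shell
# On the writing cluster: register an S3 repository normally.
curl -X PUT "localhost:9200/_snapshot/shared_repo" -H 'Content-Type: application/json' \
  -d '{"type":"s3","settings":{"bucket":"my-snapshot-bucket"}}'

# On the reading cluster: register the same bucket read-only, so it
# never writes to the shared repository.
curl -X PUT "localhost:9200/_snapshot/shared_repo" -H 'Content-Type: application/json' \
  -d '{"type":"s3","settings":{"bucket":"my-snapshot-bucket","readonly":true}}'
```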

@leandrojmp One of the deployments cannot access the internet for certain reasons, so I have to find a way to export those indices.

I've successfully done the export-import several times without any accidents.
Thanks for the help.

Let me share the scripts I wrote while testing, for anyone who has the same requirement.

Export: GitHub
Import: GitHub
