Restore Snapshot from Deployment-A to Deployment-B

Hi there. In Elastic Cloud, I'm trying to do the following:

From my local server (I want to automate it):

  1. issue a curl call to the Elastic API that tells Deployment-A to create a snapshot
  2. then issue a curl call to the Elastic API that tells Deployment-B to restore from the snapshot created in Step 1

Basically, I want the API calls for what happens when you perform the "Restore from another deployment" option in Kibana (attached).

TIA

Hi,

You can read these docs:

Snapshot and restore | Elasticsearch Guide [8.14] | Elastic

Create snapshot API | Elasticsearch Guide [8.14] | Elastic

Restore snapshot API | Elasticsearch Guide [8.14] | Elastic
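
For reference, those two documented calls look roughly like this (a sketch with Python's httpx; the hosts, API key, and snapshot name are placeholders, and found-snapshots is the default Elastic Cloud repository):

# Minimal sketch of the documented create/restore snapshot APIs; all values are placeholders.
import httpx

headers = {"Authorization": "ApiKey <YOUR_API_KEY>"}

# Create a snapshot on Deployment-A and wait for it to finish.
httpx.put(
    "https://deployment-a.es.example.com/_snapshot/found-snapshots/my-snapshot",
    params={"wait_for_completion": "true"},
    headers=headers,
    timeout=None,
)

# Restore it on Deployment-B (this assumes Deployment-B can already read that repository).
httpx.post(
    "https://deployment-b.es.example.com/_snapshot/found-snapshots/my-snapshot/_restore",
    json={"indices": "my-index-*"},
    headers=headers,
)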

Regards

Thanks, but I have read those before. They don't address what I described in the screenshots. I have reached out to support, and they admit this is probably some behind-the-scenes magic that they aren't exposing. Waiting for a response.


I'm also very interested in doing this.

I'm looking for an automated way to restore Deployment-B from Deployment-A so I can periodically have my test environment clone the state of Deployment-A

Update:

I did some work this week that has almost accomplished what you're looking for here, but I have a problem. My steps were:

  1. "GET" "/_snapshot/found-snapshots" from deployment-A's API and store repo_info = response.json()["found-snapshots"]
  2. register the repo on deployment-B as read-only:
import copy
from dataclasses import dataclass

@dataclass
class ElasticCluster:
    """Connection details for the Elasticsearch API."""

    elasticsearch_host: str
    elasticsearch_api_key: str

# ...
repo_info = copy.deepcopy(repo_info)
# https://www.elastic.co/guide/en/elasticsearch/reference/current/repository-s3.html
repo_info["settings"]["readonly"] = True
# repo_info["settings"]["compress"] = False

# es_request is a small wrapper I made around httpx: it adds a header with the API key
# stored in target_cluster and builds the URL from elasticsearch_host (full definition below)
res = es_request(
    cluster=target_cluster,
    method="PUT",
    path=f"/_snapshot/{target_repo}",
    json=repo_info,
)

  3. Now when I visit /app/management/data/snapshot_restore/repositories on Deployment-B's Kibana and try "Verify repository" on the new repo, I get an error suggesting that Deployment-A's S3 client is not accessible to Deployment-B:
"Unknown s3 client name [elastic-internal-XXXXX]. Existing client configs: elastic-internal-YYYYY,default"

Interestingly, this isn't a problem on Deployment-C, for which I initially used the Elastic Cloud UI to restore from a snapshot of Deployment-A. So I think the UI magically granted Deployment-C access to Deployment-A's S3 client. I'm now looking for a way to do this programmatically...

  4. The last step I would want to do is hit Deployment-B's API requesting a restore from a desired snapshot in the repo I registered above (a rough sketch just below).
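
For illustration, that restore call might look like this with the same es_request wrapper (the snapshot name and index pattern are placeholders, not from my actual setup):

# Sketch of step 4: restore a chosen snapshot from the repo registered above.
# "my-snapshot-name" and "my-index-*" are placeholders.
res = es_request(
    cluster=target_cluster,
    method="POST",
    path=f"/_snapshot/{target_repo}/my-snapshot-name/_restore",
    json={"indices": "my-index-*"},
)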

Also, btw, I started a related thread here in February 2024: Restore from found-snapshots across clusters

Final update (sorry to spam this thread, but I know somebody will appreciate this being documented in the future):

I was able to get this working by just using the S3 client settings belonging to Deployment-B to configure the read-only snapshot bucket from Deployment-A.

# updated code:
import copy
import logging
from typing import Dict, Optional

import httpx

logger = logging.getLogger(__name__)


def register_repo(
    target_cluster: ElasticCluster,
    target_repo: str,
    remote_repo: Dict,  # result of get_repo_info()
    s3_client: str,  # S3 client inside **target** env to use for reading remote repo
) -> bool:
    """
    Registers the provided snapshot repo, `remote_repo`, inside of `target_cluster` for read-only access under the name `target_repo`.
    Returns True on success.
    Visit /app/management/data/snapshot_restore/repositories in Kibana to see repos.
    https://www.elastic.co/guide/en/elasticsearch/reference/current/put-snapshot-repo-api.html
    https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshots-register-repository.html
    """

    remote_repo = copy.deepcopy(remote_repo)
    # https://www.elastic.co/guide/en/elasticsearch/reference/current/repository-s3.html
    remote_repo["settings"]["readonly"] = True
    remote_repo["settings"]["client"] = s3_client  # e.g. can't use PRD client in TST
    # remote_repo["settings"]["compress"] = False

    res = es_request(
        cluster=target_cluster,
        method="PUT",
        path=f"/_snapshot/{target_repo}",
        json=remote_repo,
    )
    return res.status_code == 200


def get_repo_info(cluster: ElasticCluster, repo_name: str) -> Optional[Dict]:
    """Get metadata about a snapshot repo e.g. {'type': 's3', 'uuid': ...}"""
    res = es_request(cluster, "GET", f"/_snapshot/{repo_name}")
    if res.status_code != 200:
        logger.warning(
            f"failed to get repo info for '{repo_name}': status {res.status_code}, {res.text}"
        )
        return None
    return res.json()[repo_name]


def es_request(
    cluster: ElasticCluster, method: str, path: str, **httpx_kwargs
) -> httpx.Response:
    """
    Wrapper for making an arbitrary request to Elasticsearch.
    Based on core/app/utils/elasticsearch_http.py but synchronous for simplicity.
    """

    headers = {
        "Authorization": f"ApiKey {cluster.elasticsearch_api_key}",
        "Accept-Encoding": "gzip",  # prefer compressed responses
    }

    es_url = cluster.elasticsearch_host.rstrip("/") + "/" + path.lstrip("/")
    res = httpx.request(method, es_url, headers=headers, **httpx_kwargs)
    return res
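
Putting it together, usage looks roughly like this (the hosts, API keys, and client name here are placeholders for my real values):

# Hypothetical end-to-end wiring; hosts, keys, and client name are placeholders.
prd = ElasticCluster(
    elasticsearch_host="https://deployment-a.es.example.com",
    elasticsearch_api_key="<PRD_API_KEY>",
)
tst = ElasticCluster(
    elasticsearch_host="https://deployment-b.es.example.com",
    elasticsearch_api_key="<TST_API_KEY>",
)

repo_info = get_repo_info(prd, "found-snapshots")
if repo_info:
    # Register Deployment-A's repo on Deployment-B, read via Deployment-B's own S3 client.
    register_repo(tst, "prd-snapshots", repo_info, s3_client="elastic-internal-YYYYY")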

This works for my existing environments, but interestingly it didn't work for a new environment I made from scratch temporarily to test (which got an AWS access error). This is good enough for me for now because my existing environments are all I care about.

The alternative solution to all of this would theoretically be to create my own S3 bucket in my own AWS account, configure it in Deployment-A as a snapshot repository (via the API) with a snapshot lifecycle policy, configure it in my other environments as a read-only repository, and provide credentials to all my deployments so they can read from this bucket (and so Deployment-A can write to it). At least then you'd have full control of the access management of each environment, and it wouldn't be so opaque.
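
In rough strokes, that alternative might look like this (the bucket, client, and policy names are all hypothetical, and the S3 client would first need credentials in each deployment's keystore):

# Hypothetical sketch of the self-managed bucket alternative; all names are made up.
# On Deployment-A: register my own bucket as a writable repo...
es_request(
    cluster=prd,
    method="PUT",
    path="/_snapshot/my-shared-repo",
    json={"type": "s3", "settings": {"bucket": "my-snapshot-bucket", "client": "my_client"}},
)
# ...and attach a snapshot lifecycle policy so snapshots are taken nightly.
es_request(
    cluster=prd,
    method="PUT",
    path="/_slm/policy/nightly-to-shared",
    json={
        "schedule": "0 30 1 * * ?",  # 01:30 every night
        "name": "<nightly-{now/d}>",
        "repository": "my-shared-repo",
        "config": {"indices": "*"},
    },
)
# On each other deployment: register the same bucket read-only.
es_request(
    cluster=tst,
    method="PUT",
    path="/_snapshot/my-shared-repo",
    json={"type": "s3", "settings": {"bucket": "my-snapshot-bucket", "client": "my_client", "readonly": True}},
)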

And for step 4, I first make an API call that deletes all the indices I'm going to restore, prior to the restore API call.
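
Concretely, something like this (the index pattern and snapshot name are placeholders; note that in 8.x, wildcard index deletion requires action.destructive_requires_name to be set to false):

# Sketch: wipe the target indices, then restore them from the snapshot; names are placeholders.
es_request(cluster=tst, method="DELETE", path="/my-index-*")
res = es_request(
    cluster=tst,
    method="POST",
    path="/_snapshot/prd-snapshots/my-snapshot-name/_restore",
    json={"indices": "my-index-*", "include_global_state": False},
)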