Can't move large ES snapshots to different server

I'm having an issue moving an Elasticsearch index to a different machine. I'm running Elasticsearch on a single machine inside a Docker container (single-node cluster). I have 2 indexes that are 28.6 GB and 15 GB in size.

I have a file system repository that was created this way:

PUT /_snapshot/daily_backup
{
  "type": "fs",
  "settings": {
    "location": "/backups/daily"
  }
}

and a policy that was created this way:

PUT /_slm/policy/daily-snapshots
{
  "schedule": "0 0 20 * * ?",
  "name": "<daily-snap-{now/d}>",
  "repository": "daily_backup",
  "config": {
    "indices": ["indx_jan2020","indx_mrt2021"]
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}

Until recently I was able to copy the contents of /backups/daily over to another server, create the same repository there and restore the indexes from one of the snapshots.
It seems this is not working anymore. The /backups/daily directory has become quite large (350 GB), probably because there are 30 snapshots for both indexes.

When I now copy the contents of /backups/daily to another server and create the repository, it just tells me that there are no snapshots in the repository.

I tried making just a single snapshot of one of those indexes by creating a new policy. I was then able to copy it over to another machine and restore the index from it.

So now I am wondering what the reason could be that I can't use the large repository on another server to restore from.

Some things I have already checked:

  • Both servers are on the same ES version (7.16.2)
  • the filesystem rights for the /backups/daily directory are correct (1000:0)
  • The repository can be verified
  • no snapshots were running while copying the data
  • it doesn't seem to be an issue specific to the target machine (tried the same action on an Azure VM, same result)
  • there are no errors in the logs whatsoever

Any ideas what could be the issue?

You're effectively taking a repository backup and then restoring it elsewhere, the proper procedure for which is covered in the manual.

The usual way to get the effect you report is described in the final paragraph:

When restoring a repository from a backup, you must not register the repository with Elasticsearch until the repository contents are fully restored. If you alter the contents of a repository while it is registered with Elasticsearch then the repository may become unreadable or may silently lose some of its contents.

Make sure you're not registering the repository too early.
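
As a rough sketch of that order of operations, assuming the copy into /backups/daily on the target server has already finished completely and that this path is listed under path.repo in the target's elasticsearch.yml, the registration and a first check could look like this:

```console
# Run only after the copy has fully completed on the target server.
PUT /_snapshot/daily_backup
{
  "type": "fs",
  "settings": {
    "location": "/backups/daily"
  }
}

# List every snapshot Elasticsearch can see in the repository.
GET /_snapshot/daily_backup/_all
```

If the second request returns an empty list, the repository metadata that Elasticsearch read at registration time did not reference any snapshots.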

I registered the repository after all files were in the right place. Are there other circumstances that could cause this issue?

I can't think of any. There's no magic involved: the snapshot repository is just some files on disk which Elasticsearch reads when the repository is registered. Does the problem persist even if you unregister the repository and then register it again? Does setting "readonly": true on the repository make any difference? If you try creating a new snapshot in this repository, what happens?
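
Those three checks might be run roughly as follows. This is only a sketch: it reuses the repository name and location from the question, and the snapshot name manual-test-1 is a made-up placeholder. Creating a snapshot writes to the repository, so it is probably safest to try that last step against a copy of the data.

```console
# 1. Unregister the repository. This removes only the registration, not the files on disk.
DELETE /_snapshot/daily_backup

# 2. Register it again, this time read-only.
PUT /_snapshot/daily_backup
{
  "type": "fs",
  "settings": {
    "location": "/backups/daily",
    "readonly": true
  }
}

# 3. Check whether the snapshots are listed now.
GET /_snapshot/daily_backup/_all

# 4. To test snapshot creation, register the repository again without
#    "readonly" first, then create a snapshot (manual-test-1 is a hypothetical name).
PUT /_snapshot/daily_backup/manual-test-1?wait_for_completion=true
{
  "indices": "indx_mrt2021"
}
```

Comparing the output of step 3 before and after the new snapshot is created should show whether Elasticsearch treats the copied repository as empty or can still see the older snapshots.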
