Elasticsearch cannot restore incremental snapshots in a new cluster

Hello. I have created an s3 repository and taken a full snapshot plus several incremental snapshots (the data hasn't changed since the full snapshot was taken; I'm still working on the development infrastructure). To validate the snapshots, I tried to restore several indices on a new cluster. However, when I register the repository on the new cluster, it only sees the full snapshot, not the incremental snapshots. Both clusters run Elasticsearch 7.11.1. Do you have any suggestions or ideas on how I can restore the incremental snapshots?
Later edit: even when the incremental backups contain data that the full backup doesn't, they still don't appear.
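For context, the setup is roughly as follows (repository and bucket names here are placeholders, not the real ones):

PUT /_snapshot/my-s3-repo
{
  "type": "s3",
  "settings": {
    "bucket": "my-backup-bucket"
  }
}

PUT /_snapshot/my-s3-repo/backup-2021-03-23-h08-m39?wait_for_completion=true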

What do you mean?

Can you run on both clusters:

GET /
GET /_cat/snapshots/REPO-NAME-HERE?v

And share the outputs here.

This is from the cluster where the backup was taken

id                         status start_epoch start_time end_epoch  end_time duration indices successful_shards failed_shards total_shards
backup-2021-03-23-h08-m39 SUCCESS 1616481554  06:39:14   1616481615 06:40:15       1m      81                89             0           89
backup-2021-03-23-h08-m52 SUCCESS 1616482370  06:52:50   1616482371 06:52:51     1.2s      81                89             0           89
backup-2021-03-23-h09-m05 SUCCESS 1616483146  07:05:46   1616483147 07:05:47       1s      81                89             0           89
backup-2021-03-23-h09-m08 SUCCESS 1616483294  07:08:14   1616483295 07:08:15       1s      81                89             0           89
backup-2021-03-23-h10-m02 SUCCESS 1616486534  08:02:14   1616486535 08:02:15     1.2s      81                89             0           89
backup-2021-03-23-h10-m04 SUCCESS 1616486685  08:04:45   1616486686 08:04:46     1.2s      82                90             0           90

This is from the cluster where I am trying to restore

id                         status start_epoch start_time end_epoch  end_time duration indices successful_shards failed_shards total_shards
backup-2021-03-23-h08-m39 SUCCESS 1616481554  06:39:14   1616481615 06:40:15       1m      81                89             0           89

I do not see anything wrong here. All snapshots are full snapshots, although data is reused between them, which is what is sometimes referred to as the incremental part. What were you expecting to see?

I expected to see all backups from the first cluster appear on the second cluster

I am not sure I understand. As every backup is a full backup, you only ever restore one of them.

Isn't there a mechanism whereby, if I restore backup backup-2021-03-23-h10-m04 on the new cluster, the other backups are restored before it? Or can I only restore the first backup taken?

All backups are full and contain the data that was in the cluster at the time each was taken. If segments have not changed between snapshots, they are reused (snapshots keep track of which segments are used by how many snapshots), which means a full copy of all data is not added to the repository for every snapshot; only new segments are added, which is where the "incremental" description comes from. You should always restore only the latest snapshot, unless you want to go back to a previous point in time.
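For example, restoring the latest snapshot on the second cluster would look something like this (the repository name is a placeholder, and any existing indices with the same names would need to be closed or deleted first):

POST /_snapshot/REPO-NAME-HERE/backup-2021-03-23-h10-m04/_restore
{
  "indices": "*",
  "include_global_state": false
}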

You did not share the output of:

GET /

on both clusters. Could you?

This is from the first cluster

{
  "name" : "...",
  "cluster_name" : "es-storage-backup",
  "cluster_uuid" : "...",
  "version" : {
    "number" : "7.11.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "ff17057114c2199c9c1bbecc727003a907c0db7a",
    "build_date" : "2021-02-15T13:44:09.394032Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

This is from the second cluster

{
  "name" : "...",
  "cluster_name" : "es-staging-restore",
  "cluster_uuid" : "...",
  "version" : {
    "number" : "7.11.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "ff17057114c2199c9c1bbecc727003a907c0db7a",
    "build_date" : "2021-02-15T13:44:09.394032Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Could you share the repository settings from both clusters?

I think I found the issue. The backup job runs once: it registers the repository on the test cluster, does a first restore, and all goes OK. On the second run, it creates a new snapshot in the same repository, then sends the create-repository command to the test cluster again; this time the command does nothing and does not refresh the list of snapshots known for the repository.
I'm not sure if it's a bug or functioning as designed.

I think this means the repository on the new cluster is not registered with readonly: true, which is very dangerous. You must not register the same repository as writable on two different clusters.
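For example, on the restore cluster the repository registration would look something like this (repository and bucket names are placeholders):

PUT /_snapshot/REPO-NAME-HERE
{
  "type": "s3",
  "settings": {
    "bucket": "my-backup-bucket",
    "readonly": true
  }
}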

I was not aware of this myself :stuck_out_tongue:
I thought it was fine as long as the exact same version is used on both clusters.

I guess I should read the docs. :man_facepalming:

If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. Having multiple clusters write to the repository at the same time risks corrupting the contents of the repository.

Thank you very much. I just re-registered the repository on the second cluster as readonly, and it now reloads the snapshots automatically.
