Restores are 20x slower after migrating from 5.x to 7.x

Hi,
Found some related topics but no actual answer.
We run daily snapshots in production, and every morning we wipe our test cluster and do a full restore. This way we test our backups, and our staff has a playground environment with very good data.

This worked fine with 5.x: it took a bit less than an hour to complete, which was acceptable. However, since switching to 7.x, it does not even complete in a day.

An empty index took 9.2 minutes to recover according to _cat/recovery, and most of the empty indices took around that long.

I have tried fiddling with the settings:

"cluster": {
      "routing": {
        "allocation": {
          "node_concurrent_recoveries": "10",
          "node_initial_primaries_recoveries": "20"
        }
      }
    },
    "indices": {
      "recovery": {
        "max_bytes_per_sec": "250mb",
        "max_concurrent_file_chunks": "5"
      }
    }

But that does not change the restore speed at all.
Is there any way to get back to a decent recovery speed, or is our use case no longer supported?

Thanks in advance

Welcome to our community! :smiley:

Can you check hot threads while you are restoring?
Also, can you share the output from the monitoring APIs in Restore a snapshot | Elasticsearch Guide [8.3] | Elastic while the restore is running?

It ended up taking 5 days for 50GB.
I cannot really take down the cluster anymore, because that problem would prevent anyone from working in the meantime.

I tried deleting a single index and restoring it, but I get an error:

unknown field name [min_version]

even though both source and target are on the exact same version, 7.17.5.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.