I'm trying to use the backup and restore functionality via the kopf plugin in order to copy production data to a test server.
The production setup consists of 3 nodes. The test setup is one node.
I'm able to do what I need with one of the two indexes (1 shard, 1 replica).
But when I try to do this with the second index (2 shards, 1 replica), the restore renders the index unusable because of unassigned shards.
In order to set up the test server, I needed to make it an exact copy of the production server (3 nodes, etc.), which is a waste of resources. Is what I'm trying to do even possible?
What do you mean by "unusable"? Is the index in a red state?
You should be able to restore a three-node cluster into a single-node cluster. The replicas will be unassigned, because they won't allocate to the same node as the primary shards. But all the primary shards should allocate and your cluster will be in a "yellow" state.
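If you don't care about replicas on the test box at all, you can also override the replica count at restore time so the single node can even go green. A sketch (the repository name `my_backup`, snapshot name `snapshot_1`, and index name `my_index` are placeholders for whatever yours are called):

```shell
# Restore the index from the snapshot, overriding the replica count so a
# single-node cluster can allocate every shard copy.
curl -XPOST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore" -d '{
  "indices": "my_index",
  "index_settings": {
    "index.number_of_replicas": 0
  }
}'

# Check the cluster state afterwards; it should be green once primaries allocate.
curl -XGET "localhost:9200/_cluster/health?pretty"
```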
Hm, I'm not sure tbh. Replicas aren't saved when you create a snapshot (since they're just extra copies of the same data, there's no need to back them up). They are rebuilt after the primary shards have been restored to the cluster.
So if your cluster is staying red, there is either a problem with the primary shards you are trying to restore, or some kind of external factor that is preventing all the primaries from restoring onto the new cluster.
Is there a same-named index that you are trying to restore "on top" of?
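Worth checking, since a restore can't write into an open index with the same name. Something like this (`my_index` is a placeholder):

```shell
# See whether a same-named index already exists on the target cluster.
curl -XGET "localhost:9200/_cat/indices/my_index?v"

# Either close it before restoring...
curl -XPOST "localhost:9200/my_index/_close"

# ...or delete it entirely if you don't need what's there.
curl -XDELETE "localhost:9200/my_index"
```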
OK, I'm trying it locally with a Docker image.
I'm restoring the two indices one by one. First the small one, which has one shard and is relatively small (100 MB). At first the shard was in INIT state and remained like that for some minutes; after that it got to START state and the cluster was yellow and accessible again. Now I'm doing the same with the big index (1.5 GB). There are two shards, both in INIT state. I'm checking the progress via the recovery API. Maybe the problem was that when I was doing it in the Elastic Cloud setup, I didn't give it enough time? If that's the case (because I waited for about 15 mins), why did it finish in just a few minutes here when I had the exact same setup (number of nodes etc.)? Is recovery faster when using more nodes?
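For reference, this is roughly how I'm watching the restore (the index name `my_index` is a placeholder):

```shell
# One line per shard: recovery stage (init/index/done), bytes recovered,
# and percent complete.
curl -XGET "localhost:9200/_cat/recovery?v"

# Detailed JSON view for a single index, including source (the snapshot)
# and per-file progress.
curl -XGET "localhost:9200/my_index/_recovery?pretty"
```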
It will take some time to recover indices, depending on size. Not sure exactly how much to be honest; it depends on hardware + index size. 15 minutes does sound like a long time for just 1.5 GB though.
There are a few things that affect recovery speed. More nodes == more disk IO and network resources, so recovery tends to go faster. There are also throttling settings that limit how many concurrent recoveries a single node can be doing (to prevent a lot of recoveries from swamping a single node in production), so having more nodes means more parallel recoveries and you won't hit the throttling.
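These are the kinds of knobs I mean. The exact setting names and defaults vary by version, so treat the values below as illustrative and check your version's docs before changing anything:

```shell
# Raise the per-node recovery bandwidth cap and the number of concurrent
# shard recoveries a node will perform. Applied as transient settings, so
# they reset on a full cluster restart. Values are examples, not advice.
curl -XPUT "localhost:9200/_cluster/settings" -d '{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "100mb",
    "cluster.routing.allocation.node_concurrent_recoveries": 4
  }
}'
```

On a throwaway single-node test box it's usually safe to open these up; in production the defaults exist precisely to keep recoveries from starving live traffic.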
Dunno if that's what you ran into. It may have been something else, but I'm still thinking about it.