Creating/Restoring snapshot from one cluster to another

I have two clusters:

  • Cluster A with 2 nodes (Node A1 -> Master+Data, Node A2 -> Data) - Acts as development environment
  • Cluster B with 1 node (Node B1 -> Master+Data) - Acts as live environment

With this setup, I am trying to take a snapshot of some (not all) indices in Cluster A and restore it in Cluster B. To do so:

  • All nodes in Cluster A and Cluster B have path.repo set to /usr/local/elasticsearch/backups (not a shared location, but a directory local to each node)
  • I create the snapshot on Cluster A using:
    $params = [
      'repository' => 'repo_name',
      'snapshot' => 'snap_test',
      'wait_for_completion' => true,
      'body' => [
        'indices' => 'indexA',
        'ignore_unavailable' => true,
        'include_global_state' => false
      ]
    ];

    $response = $client->snapshot()->create($params);
  • Then the entire /usr/local/elasticsearch/backups directory is transferred to Cluster B, where the snapshot is restored using:
    $params = [
      'wait_for_completion' => true,
      'repository' => 'repo_name',
      'snapshot' => 'snap_test'
    ];

    $response = $client->snapshot()->restore($params);
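
The steps above do not show how repo_name was registered on either cluster. As a minimal sketch, assuming it is a shared-filesystem ("fs") repository whose location is the path.repo directory, the registration parameters could be built like this. The helper name buildFsRepositoryParams is hypothetical, and the client calls in the trailing comments assume an already constructed elasticsearch-php $client.

```php
<?php
// Hypothetical helper: builds the parameter array for registering an
// "fs" snapshot repository. $location must sit inside a directory that is
// listed under path.repo on every node of the cluster.
function buildFsRepositoryParams(string $name, string $location): array
{
    return [
        'repository' => $name,
        'body' => [
            'type' => 'fs',
            'settings' => [
                'location' => $location,
            ],
        ],
    ];
}

// Repository name and path are the ones used in this post.
$params = buildFsRepositoryParams('repo_name', '/usr/local/elasticsearch/backups');

// Assumed usage on each cluster, given an existing client in $client:
// $client->snapshot()->createRepository($params);
// $client->snapshot()->verifyRepository(['repository' => 'repo_name']);
```

Verifying the repository after registering it is one way to find out early whether every node can actually read and write the location.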

The problem I am facing is that, when I restore the snapshot in Cluster B, the log starts showing error messages such as the following:

[2017-11-17T14:51:45,267][WARN ][o.e.c.a.s.ShardStateAction] [node-XXX-data] [indexA][1] received shard failed for shard id [[indexA][1]], allocation id [S1Da2f4FTsOoHdCtuptyeQ], primary term [0], message [failed recovery], failure [RecoveryFailedException[[indexA][1]: Recovery failed on {node-XXX-data}{FJIxmnJOTyWrP2r2sij3pw}{SqS2ySz4SK66X14qDbXtsQ}{127.0.0.1}{127.0.0.1:9300}]; nested: IndexShardRecoveryException[failed recovery]; nested: IndexShardRestoreFailedException[restore failed]; nested: ElasticsearchException[failed to create blob container]; nested: AccessDeniedException[/usr/local/elasticsearch/backups/indices/cJjWvkWwSNqOP-2PQAavIQ/1]; ]

Looking into the /usr/local/elasticsearch/backups directory created in Cluster A, the path /usr/local/elasticsearch/backups/indices/cJjWvkWwSNqOP-2PQAavIQ/1 doesn't exist at all! This makes me wonder whether I understand the snapshot creation process at all.

I was under the impression that creating a snapshot on Node A1 pulls the data distributed across the other nodes (in this case Node A2) and writes the complete snapshot to /usr/local/elasticsearch/backups on Node A1. That is essentially what I am looking for, since storing the data in the cloud or in a shared location isn't an option.
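
One way to test that impression is to check which node holds each primary shard of indexA. With an fs repository, each data node writes the files of its own shards into the repository location, which is why that location is normally expected to be a mount shared by all master and data nodes. The sketch below is only an illustration: primariesByNode is a hypothetical helper, the sample rows are made up, and the row shape is assumed to match the cat shards API with JSON output.

```php
<?php
// Hypothetical helper: groups primary shards by the node that holds them.
// Each row is assumed to carry 'index', 'shard', 'prirep' ('p' or 'r')
// and 'node', as in the JSON output of the cat shards API.
function primariesByNode(array $rows): array
{
    $byNode = [];
    foreach ($rows as $row) {
        if ($row['prirep'] !== 'p') {
            continue; // skip replicas; only primaries are of interest here
        }
        $byNode[$row['node']][] = (int) $row['shard'];
    }
    foreach ($byNode as &$shards) {
        sort($shards);
    }
    unset($shards); // break the reference left by the loop above
    ksort($byNode);
    return $byNode;
}

// Made-up sample rows, not taken from the clusters in this post.
$rows = [
    ['index' => 'indexA', 'shard' => '0', 'prirep' => 'p', 'node' => 'A1'],
    ['index' => 'indexA', 'shard' => '1', 'prirep' => 'p', 'node' => 'A2'],
    ['index' => 'indexA', 'shard' => '1', 'prirep' => 'r', 'node' => 'A1'],
];
$layout = primariesByNode($rows);

// Assumed way to obtain real rows, given an existing client in $client:
// $rows = $client->cat()->shards(['index' => 'indexA', 'format' => 'json']);
```

If a primary shard lives on Node A2, its files would be expected under the backups directory on A2 rather than A1, which would be consistent with the missing indices/.../1 directory described above.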

It would be great if someone could explain how snapshots work (or point me to a blog/KB article) and help out with this situation.

Quick question.

When you are creating a snapshot from cluster A, you are saving the snapshot to location X. And cluster B will restore from location X as well.

Am I correct in reading that X is the same as path.repo?

And if not, do both clusters have necessary permissions to read/write from/to X?

> When you are creating a snapshot from cluster A, you are saving the snapshot to location X. And cluster B will restore from location X as well.
> Am I correct in reading that X is the same as path.repo?

Correct, location X is the same as path.repo. However, it isn't a shared location; it is local to each node.

> And if not, do both clusters have necessary permissions to read/write from/to X?

They do!

One point I left out of the initial post is that the index gets restored partially (with red status) and then the restore fails. For example, when I tried it recently, only 351000 of 714000 documents were restored.
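
To see which shards are behind the red status after a partial restore, the cluster health response for the index can be inspected at shard level. This is a minimal sketch: redShards is a hypothetical helper, the sample response is made up, and its shape is assumed to follow the cluster health API when called with level set to shards.

```php
<?php
// Hypothetical helper: returns the ids of the shards of $index whose
// status is "red" in a cluster health response requested with level=shards.
function redShards(array $health, string $index): array
{
    $red = [];
    foreach ($health['indices'][$index]['shards'] ?? [] as $shardId => $shard) {
        if (($shard['status'] ?? '') === 'red') {
            $red[] = (int) $shardId;
        }
    }
    return $red;
}

// Made-up sample response, not taken from the cluster in this post.
$health = [
    'indices' => [
        'indexA' => [
            'status' => 'red',
            'shards' => [
                '0' => ['status' => 'green', 'primary_active' => true],
                '1' => ['status' => 'red', 'primary_active' => false],
            ],
        ],
    ],
];
$failed = redShards($health, 'indexA');

// Assumed way to obtain a real response, given an existing client in $client:
// $health = $client->cluster()->health(['index' => 'indexA', 'level' => 'shards']);
```

Comparing the red shard ids with the shard directories present in the copied backups directory may show whether the failed shards are the ones whose files never reached Cluster B.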
