S3 backup configuration error

Hello,
I'm trying to configure an AWS S3 repository for backups, but I'm running into access issues with the S3 bucket. These are the steps I took:

  1. Installed the repository-s3 plugin on all cluster nodes
  2. Restarted all nodes after the plugin installation
  3. Used elasticsearch-keystore to configure the S3 credentials, setting both s3.client.default.access_key and s3.client.default.secret_key
  4. Configured the repository through Kibana, and got an error when testing the repository:

    It seems that Elasticsearch tries to get the credentials from the attached IAM role rather than using the credentials in the keystore.
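
For reference, the keystore setup in step 3 was roughly as follows (a sketch, run from the Elasticsearch installation directory on each node; it assumes the default S3 client name):

```shell
# Run on every Elasticsearch node.
# Each command prompts for the value; paste the corresponding AWS credential.
bin/elasticsearch-keystore add s3.client.default.access_key
bin/elasticsearch-keystore add s3.client.default.secret_key

# Confirm both settings are now present in the keystore
bin/elasticsearch-keystore list
```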

How can I make it consider the keystore configuration?

Thanks,
Lior

Hi @Lior_Yakobov,

Can you confirm that you've set up the S3 credentials in all ES nodes?

Thank you!

Hey @afharo,
Yes I have created the keystore pairs on all cluster nodes.
Does the creation of the keystore pairs also require a restart of the Elasticsearch process in order for Elasticsearch to recognize them?

Thanks,
Lior

There is no need to restart Elasticsearch, although you might need to call the _nodes/reload_secure_settings API to get all the nodes to re-read them: Secure settings | Elasticsearch Reference [7.10] | Elastic
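
A minimal sketch of that call, assuming the cluster is reachable on localhost:9200:

```shell
# Ask every node to re-read its secure settings (keystore) without a restart.
# The s3.client.* credentials are among the reloadable secure settings.
curl -X POST "localhost:9200/_nodes/reload_secure_settings?pretty"
```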

Hey @afharo,
Thanks for the heads-up, I probably missed it.
So after running the reload_secure_settings request I do see that test files were created in the S3 bucket, but Kibana now complains about something else:

{
  "error": {
    "root_cause": [
      {
        "type": "repository_verification_exception",
        "reason": "[bucket-elasticsearch-repo] [[DmPiHBoGSxWOxcAUnSEHKw, 'RemoteTransportException[[aws-elkdb22][10.128.115.52:9300][internal:admin/repository/verify]]; nested: RepositoryMissingException[[bucket-elasticsearch-repo] missing];']]"
      }
    ],
    "type": "repository_verification_exception",
    "reason": "[bucket-elasticsearch-repo] [[DmPiHBoGSxWOxcAUnSEHKw, 'RemoteTransportException[[aws-elkdb22][10.128.115.52:9300][internal:admin/repository/verify]]; nested: RepositoryMissingException[[bucket-elasticsearch-repo] missing];']]"
  },
  "status": 500
}

Is there something else I'm missing here?

Thanks again,
Lior

What version are you running? Do you have any voting-only nodes in the cluster?

Hey @DavidTurner,
our cluster is on version 7.9.3, and I believe that by voting-only nodes you mean the Kibana nodes, as they're configured this way:

node.data: false
node.master: false
node.ingest: true
node.ml: false

If that's what you meant, then yes, we have two nodes of this kind.

Thanks,
Lior

No, I meant nodes with node.voting_only: true. Do you have any of those?
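
One way to check (a sketch, again assuming the cluster answers on localhost:9200) is to list each node's roles; in 7.x the role column uses single letters, with v marking a voting-only node:

```shell
# Show each node's name and its role letters, e.g.
# m = master-eligible, d = data, i = ingest, v = voting-only
curl "localhost:9200/_cat/nodes?v&h=name,node.role"
```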

Hey @DavidTurner,
actually, I don't have any nodes of that type in the cluster.

Lior

Hey @DavidTurner , @afharo
despite the message I received from testing the repository, it seems that I managed to perform a successful backup.
I will try to complete a restore as well, and if all goes well then we're good to go.
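
The backup and restore round trip can be sketched like this (the snapshot name here is a placeholder, not the one actually used):

```shell
# Take a snapshot into the registered repository and wait for it to finish
curl -X PUT "localhost:9200/_snapshot/bucket-elasticsearch-repo/snapshot-test?wait_for_completion=true&pretty"

# Restore it (any indices being restored over must be closed or deleted first)
curl -X POST "localhost:9200/_snapshot/bucket-elasticsearch-repo/snapshot-test/_restore?pretty"
```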

Thanks,
Lior

Hey @DavidTurner, @afharo,

So I managed to perform both a snapshot and a restore, although when I run the repository verification I still get this error message:

{
  "error": {
    "root_cause": [
      {
        "type": "repository_verification_exception",
        "reason": "[bucket-elasticsearch-repo] [[DmPiHBoGSxWOxcAUnSEHKw, 'RemoteTransportException[[aws-elkdb22][10.128.115.52:9300][internal:admin/repository/verify]]; nested: RepositoryMissingException[[bucket-elasticsearch-repo] missing];']]"
      }
    ],
    "type": "repository_verification_exception",
    "reason": "[bucket-elasticsearch-repo] [[DmPiHBoGSxWOxcAUnSEHKw, 'RemoteTransportException[[aws-elkdb22][10.128.115.52:9300][internal:admin/repository/verify]]; nested: RepositoryMissingException[[bucket-elasticsearch-repo] missing];']]"
  },
  "status": 500
}

Is there any reason for this error message to appear, even though snapshots are working?

Thanks,
Lior

There's some kind of discrepancy between the config of aws-elkdb22 and how it appears in the cluster state. If you restart that one node, does the problem go away?

@DavidTurner, Thank you for the help,
by restarting this node I noticed in the log file that it refused to start because the repository-s3 plugin was missing. I guess I somehow missed the plugin installation on this specific node, but now everything looks just fine.
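
For anyone hitting the same thing: a quick way to spot a node with a missing plugin, without restarting anything, is the cat plugins API (sketch, assuming localhost:9200):

```shell
# List the plugins installed on each node; any node missing
# repository-s3 here cannot participate in the S3 repository
curl "localhost:9200/_cat/plugins?v&h=name,component,version"
```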

Thanks again and best regards,
Lior
