Repository HDFS change in security.principal

(David F Quiroga) #1

Elasticsearch 5.6.6 with Repository HDFS plugin, HDFS kerberized

Just wanted to share my experience.

Short version:
A restart was needed after changing security.principal.

Long version:
Attempted to update the security.principal of an existing HDFS snapshot repository.

Created updated keytabs and deployed them.
Deleted the existing repository definition and posted the updated definition.
POST _snapshot/REPOSITORY/_verify was successful, as was a test snapshot.
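
For anyone scripting this, the delete, re-register and verify steps above can be sketched roughly as below. This is only an illustration, not the exact calls used here: the repository name, HDFS URI, path and principal are hypothetical placeholders, and the function only builds the requests rather than sending them.

```python
import json


def repository_requests(name, uri, path, principal):
    """Build the (method, endpoint, body) steps for re-registering an HDFS repository.

    All argument values are supplied by the caller; nothing is sent over the network.
    """
    settings = {
        "uri": uri,                       # e.g. hdfs://namenode:8020 (hypothetical)
        "path": path,                     # snapshot directory inside HDFS (hypothetical)
        "security.principal": principal,  # the new Kerberos principal
    }
    endpoint = "_snapshot/" + name
    return [
        ("DELETE", endpoint, None),  # drop the old repository definition
        ("PUT", endpoint, json.dumps({"type": "hdfs", "settings": settings})),
        ("POST", endpoint + "/_verify", None),  # ask the nodes to verify access
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for method, endpoint, body in repository_requests(
        "REPOSITORY", "hdfs://namenode:8020", "/elasticsearch/snapshots", "NEWPRINCIPAL@REALM"
    ):
        print(method, endpoint, body or "")
```

As described below, a successful _verify right after this sequence did not mean the change had fully taken effect.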

But the next day _verify was failing with:

    "type": "unchecked_i_o_exception",
    "reason": "unchecked_i_o_exception: Could not re-authenticate",
    "caused_by": {
      "type": "i_o_exception",
      "reason": "Login failure for OLDPRINCIPAL@REALM from keytab elasticsearch-5.6.6/config/repository-hdfs/krb5.keytab: Unable to obtain password from user\n",
      "caused_by": {
        "type": "login_exception",
        "reason": "login_exception: Unable to obtain password from user\n"

Deleting and re-adding the repository provided a short-term fix.
In the end I concluded that a restart was needed. After restarting the masters, the error message changed and named the specific node where the connection was failing.

I suspect the Kerberos renewal held onto the old principal until the restart.
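
Since the failure only showed up a day later, it may help to re-run _verify on a schedule and inspect the error. A minimal sketch, assuming the error object has the nested "caused_by" shape shown above; the helper names are made up for illustration:

```python
def cause_chain(error):
    """Flatten a nested error object into a list of (type, reason) pairs, outermost first."""
    chain = []
    while isinstance(error, dict):
        reason = (error.get("reason") or "").strip()
        chain.append((error.get("type"), reason))
        error = error.get("caused_by")  # None once the innermost cause is reached
    return chain


def is_kerberos_login_failure(error):
    """True if any level of the chain is a login_exception."""
    return any(err_type == "login_exception" for err_type, _ in cause_chain(error))


if __name__ == "__main__":
    # Sample mirroring the structure of the error above, with shortened reasons.
    sample = {
        "type": "unchecked_i_o_exception",
        "reason": "unchecked_i_o_exception: Could not re-authenticate",
        "caused_by": {
            "type": "i_o_exception",
            "reason": "Login failure for OLDPRINCIPAL@REALM from keytab ...",
            "caused_by": {
                "type": "login_exception",
                "reason": "login_exception: Unable to obtain password from user\n",
            },
        },
    }
    for err_type, reason in cause_chain(sample):
        print(err_type, "->", reason)
    print("kerberos login failure:", is_kerberos_login_failure(sample))
```

If the reason still names the old principal after the repository was re-registered, that would point at the stale login described here.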


