Repository on kerberized HDFS

Hi!

I'm trying to use the HDFS repository plugin (version 5.6.5) to write snapshots to HDFS.
For testing, the HDFS cluster was not kerberized and everything worked. The curl call and the payload looked like this:

cat elastic-hdfs-backup-payload.json
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode1:9001/",
    "path": "/tmp"
  }
}

curl -X PUT -u ###:### "${ELASTIC_HOST}/_snapshot/my_hdfs_repository" --data "@elastic-hdfs-backup-payload.json"

The repository was created successfully and I could trigger some snapshots:

curl -X PUT -u ###:### "${ELASTIC_HOST}/_snapshot/my_hdfs_repository/snapshot_1?wait_for_completion=true"

But now our real environment uses a kerberized HDFS. The KDC (in our current test environment) is a Docker container with krb5-server installed on a CentOS 7 base image.
It works for HDFS and also for some HDFS clients.

I followed the documentation here: https://www.elastic.co/guide/en/elasticsearch/plugins/5.6/repository-hdfs-security.html
So every node has the keytab in place.
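For reference, the layout on each node looks roughly like this (assuming a standard package install with /etc/elasticsearch as the config directory, following the linked docs; elasticsearch.keytab is just our local file name):

# create the plugin config directory and copy the keytab onto every Elasticsearch node
mkdir -p /etc/elasticsearch/repository-hdfs
cp elasticsearch.keytab /etc/elasticsearch/repository-hdfs/krb5.keytab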

But if I try to create the repository like this:

cat elastic-hdfs-backup-payload.json
{
  "type": "hdfs",
  "settings": {
  "uri": "hdfs://namenode1:9001/",
    "path": "/tmp",
    "security.principal": "principal@realm"
  }
}

the following exception happens:

$ curl -X PUT -u ###:### "${ELASTIC_HOST}/_snapshot/my_hdfs_repository" --data "@elastic-hdfs-backup-payload.json"
{"error":{"root_cause":[{"type":"repository_exception","reason":"[my_hdfs_repository] failed to create repository"}],"type":"repository_exception","reason":"[my_hdfs_repository] failed to create repository","caused_by":{"type":"illegal_argument_exception","reason":"Can't get Kerberos realm","caused_by":{"type":"invocation_target_exception","reason":"invocation_target_exception: null","caused_by":{"type":"krb_exception","reason":"krb_exception: Cannot locate default realm"}}}},"status":500}%

The log file of the master node shows a stack trace with similar 'Caused by' reasons.

Any ideas? Do I have to configure the Kerberos realm somewhere?

Best regards,
Soeren

Hi, sorry for the late response.

I would make sure that your krb5.conf file is correctly set up for your containerized KDC. The krb5.conf file is a configuration file on your filesystem that tells Kerberos implementations and clients how to behave, including which default realm to use and which addresses to contact for the KDC. Usually it lives under /etc (typically /etc/krb5.conf on Linux). Since the HDFS repository plugin relies on Hadoop's Kerberos code, which in turn relies on the underlying JVM's Kerberos implementation, there are additional settings you can pass at JVM startup that tell it where the krb5.conf file is located if it's not in the default location.
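For reference, a minimal krb5.conf for a setup like yours might look roughly like this (the realm and KDC host names below are placeholders, not values from your environment):

[libdefaults]
    default_realm = EXAMPLE.COM

[realms]
    EXAMPLE.COM = {
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }

[domain_realm]
    .example.com = EXAMPLE.COM
    example.com = EXAMPLE.COM

If the file is not at the default location on your Elasticsearch nodes, you can point the JVM at it explicitly with the standard java.security.krb5.conf system property, for example by adding a line like this to config/jvm.options (the path is just an example):

-Djava.security.krb5.conf=/etc/krb5.conf

The "Cannot locate default realm" message in your error output is typically what the JVM's Kerberos code throws when it cannot find a usable default_realm, so this is the first thing I would check.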
