Repository-hdfs error while creating a repository to back up an Elasticsearch index to HDFS

Hello everyone,

I am stuck in a situation where my Elasticsearch (7.9.0) upgrade is pending because a backup/snapshot is not yet in place.
We have a 10-node ELK cluster in the Azure cloud, of which 3 are master nodes and 7 are data nodes. Our requirement is to create snapshots of the ELK indices on HDFS nodes. Our HDFS nodes are an on-premise cluster, which means ELK and HDFS are in separate VLANs.
We have opened the necessary ports for HDFS: 8020/8022/9865/9866/9867/9871. These are the RPC and HTTP ports of the HDFS NameNode and DataNodes respectively.
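
For example, a quick port check along these lines from each Elasticsearch node can confirm the ports are actually reachable (the host names here are placeholders):

nc -vz namenode.example.com 8020    # NameNode RPC
nc -vz datanode1.example.com 9866   # DataNode data transfer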

We have also created a keytab file for Elasticsearch named krb5.keytab, which is the correct one, as I can kinit with it.
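
For example (the principal name here is illustrative):

kinit -kt krb5.keytab elasticsearch@MY.REALM
klist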

After all this, when I send the API request to create the HDFS repository, I get the following error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "repository_verification_exception",
        "reason" : "[my_hdfs_repository] path is not accessible on master node"
      }
    ],
    "type" : "repository_verification_exception",
    "reason" : "[my_hdfs_repository] path is not accessible on master node",
    "caused_by" : {
      "type" : "i_o_exception",
      "reason" : "Could not get block locations. Source file \"/user/Elasticsearch/repositories/my_hdfs_repository/tests-FEy4DNwjS-C8OR8t2dzCzg/pending-master.dat-UCIz69_kTNSFsgNEuBsDtg\" - Aborting...block==null"
    }
  },
  "status" : 500
}
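
The create request follows the standard repository-hdfs form, roughly like this (the NameNode host and principal here are placeholders; the path matches the error above):

PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode.example.com:8020/",
    "path": "/user/Elasticsearch/repositories/my_hdfs_repository",
    "security.principal": "elasticsearch@MY.REALM"
  }
}
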
I have installed repository-hdfs version 7.9.0, the same as the Elasticsearch version.

I have tried every possible workaround and even re-installed the plugin, yet the issue persists.

If any of you have suggestions or advice on this case, please do let me know; it would be very helpful and also a learning point.

-Sahana

You need to grant read and write permission on the HDFS path to the user who starts Elasticsearch.
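
For example (user and path taken from this thread):

hdfs dfs -chown -R Elasticsearch /user/Elasticsearch/repositories/my_hdfs_repository
hdfs dfs -chmod -R 777 /user/Elasticsearch/repositories/my_hdfs_repository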

Hello Caster,

I have already given the Elasticsearch user 777 permissions:
[root@ ]# hdfs dfs -getfacl /user/Elasticsearch/repositories/my_hdfs_repository
# file: /user/Elasticsearch/repositories/my_hdfs_repository
# owner: Elasticsearch
# group: supergroup
user::rwx
user:Elasticsearch:rwx
group::r-x
mask::rwx
other::r-x

Yet I get the same error output:
"caused_by" : {
"type" : "i_o_exception",
"reason" : "Could not get block locations. Source file "/user/Elasticsearch/repositories/my_hdfs_repository/tests-_31sguBJT_qzP06HqW5X4g/pending-master.dat-ULPvWhOvT7i1b3JRcjYm0A" - Aborting...block==null"
}

Hi @Sahana_Murthy. Do you have all of your Elasticsearch nodes configured to talk to HDFS? Can you do something like hadoop dfs -ls /user/Elasticsearch/repositories from all of them?

Hello Keith,

No, I cannot list the HDFS directory from the Elasticsearch nodes, because my ELK cluster and Hadoop cluster are hosted on different servers.

I am trying to store the ELK backup remotely in the HDFS cluster.

My ELK cluster is Azure-based, whereas HDFS is on-premise.

Isn't that why the HDFS plugin exists, to store remote backups of ELK indices?

Regards
Sahana

You are right that you do not need the HDFS client on the Elasticsearch nodes -- but if you did have it, I thought it would be an easy way to test that you can reach the HDFS cluster from them. It's definitely not required, though.

Assuming that your HDFS cluster is healthy and reachable from all of your Elasticsearch nodes, and that you are using a user who has permission to read from and write to HDFS, I would suspect Kerberos next.
First, raise the logging levels for the HDFS repository:

PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.repositories.hdfs":"DEBUG",
    "logger.org.apache.hadoop":"DEBUG"
  }
}
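
(Once you are done debugging, setting the same loggers to null via the same endpoint restores the default levels:)

PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.repositories.hdfs": null,
    "logger.org.apache.hadoop": null
  }
}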

Then set this in the environment where you are running your Elasticsearch node:

HADOOP_JAAS_DEBUG=true
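
If Elasticsearch runs as a systemd service rather than from a shell, one way to set it is a drop-in override (a sketch; the unit name assumes a package install):

sudo systemctl edit elasticsearch
# then add the following to the override file and restart the service:
[Service]
Environment=HADOOP_JAAS_DEBUG=true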

And finally, specify the following jvm parameters when you start Elasticsearch:

-Djava.security.debug=logincontext -Dsun.security.krb5.debug=true
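
With a package install you can also put these in a jvm.options.d file instead of passing them on the command line (a sketch; the path assumes the default RPM/DEB layout, and jvm.options.d requires Elasticsearch 7.7 or later):

# /etc/elasticsearch/jvm.options.d/krb5-debug.options
-Djava.security.debug=logincontext
-Dsun.security.krb5.debug=true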

So your command to start Elasticsearch might look like the following (though if you already pass other arguments, you will need to include them as well):

HADOOP_JAAS_DEBUG=true ES_JAVA_OPTS="-Djava.security.debug=logincontext -Dsun.security.krb5.debug=true" bin/elasticsearch

You will get a lot of additional information in your Elasticsearch log file. A common problem we see is that the encryption type of your keytab is no longer supported by newer JVMs. If you want to post your logs here for additional advice, be careful to redact anything sensitive.
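
(As a quick check, klist can show the encryption type of each entry in the keytab; the file name is the one mentioned earlier in this thread:)

klist -ekt krb5.keytab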
