Repository-hdfs error while creating a repository to back up an Elasticsearch index to HDFS

Hello everyone,

I am stuck in a situation where my Elasticsearch (7.9.0) upgrade is pending because a backup/snapshot is not yet in place.
We have a 10-node ELK cluster in the Azure cloud, of which 3 are master nodes and 7 are data nodes. Our requirement is to create snapshots of the ELK indices on HDFS nodes. Our HDFS nodes are an on-premise cluster, which means ELK and HDFS are in separate VLANs.
We have opened the necessary ports for HDFS: 8020/8022/9865/9866/9867/9871. These are the RPC and HTTP ports of the HDFS NameNode and DataNodes respectively.
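
For example, a quick port check along these lines from each Elasticsearch node can confirm the ports are actually reachable (the host names here are placeholders):

nc -vz namenode.example.com 8020    # NameNode RPC
nc -vz datanode1.example.com 9866   # DataNode data transfer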

We have also created a keytab file for Elasticsearch named krb5.keytab, which is the correct one, as I can kinit with it.
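
For example (the principal name here is illustrative):

kinit -kt krb5.keytab elasticsearch@MY.REALM
klist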

After all this, when I send the API request to create the HDFS repository, I get the following error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "repository_verification_exception",
        "reason" : "[my_hdfs_repository] path is not accessible on master node"
      }
    ],
    "type" : "repository_verification_exception",
    "reason" : "[my_hdfs_repository] path is not accessible on master node",
    "caused_by" : {
      "type" : "i_o_exception",
      "reason" : "Could not get block locations. Source file \"/user/Elasticsearch/repositories/my_hdfs_repository/tests-FEy4DNwjS-C8OR8t2dzCzg/pending-master.dat-UCIz69_kTNSFsgNEuBsDtg\" - Aborting...block==null"
    }
  },
  "status" : 500
}
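
The create request follows the standard repository-hdfs form, roughly like this (the NameNode host and principal here are placeholders; the path matches the error above):

PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode.example.com:8020/",
    "path": "/user/Elasticsearch/repositories/my_hdfs_repository",
    "security.principal": "elasticsearch@MY.REALM"
  }
}
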
I have installed repository-hdfs version 7.9.0, the same as the Elasticsearch version.

I have tried every possible workaround and even re-installed the plugin, yet the issue persists.

If any of you have suggestions or advice on this case, please do let me know; it would be very helpful and also a learning point.

-Sahana

You need to grant read and write permission on the HDFS path to the user who starts Elasticsearch.
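
For example (user and path taken from this thread):

hdfs dfs -chown -R Elasticsearch /user/Elasticsearch/repositories/my_hdfs_repository
hdfs dfs -chmod -R 777 /user/Elasticsearch/repositories/my_hdfs_repository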

Hello Caster,

I have already given the Elasticsearch user 777 permissions:
[root@ ]# hdfs dfs -getfacl /user/Elasticsearch/repositories/my_hdfs_repository
# file: /user/Elasticsearch/repositories/my_hdfs_repository
# owner: Elasticsearch
# group: supergroup
user::rwx
user:Elasticsearch:rwx
group::r-x
mask::rwx
other::r-x

Yet I get the same error output:
"caused_by" : {
"type" : "i_o_exception",
"reason" : "Could not get block locations. Source file "/user/Elasticsearch/repositories/my_hdfs_repository/tests-_31sguBJT_qzP06HqW5X4g/pending-master.dat-ULPvWhOvT7i1b3JRcjYm0A" - Aborting...block==null"
}

Hi @Sahana_Murthy. Do you have all of your Elasticsearch nodes configured to talk to HDFS? Can you do something like hadoop dfs -ls /user/Elasticsearch/repositories from all of them?

Hello Keith,

No, I cannot list the HDFS directory from the Elasticsearch nodes, because my ELK cluster and Hadoop cluster are hosted on different servers.

I am trying to store the ELK backup remotely in the HDFS cluster.

My ELK cluster is Azure-based, whereas HDFS is on-premise.

Isn't that why the HDFS plugin exists, to store remote backups of ELK indices?

Regards
Sahana

You are right that you do not need the HDFS client on the Elasticsearch nodes -- but if you did have it, I thought it would be an easy way to test that you can reach the HDFS cluster from them. It's definitely not required, though.

Assuming that your HDFS cluster is healthy and reachable from all of your Elasticsearch nodes, and that you are using a user who has permission to read from and write to HDFS, I would suspect Kerberos next.
First, raise the logging levels for the HDFS repository:

PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.repositories.hdfs":"DEBUG",
    "logger.org.apache.hadoop":"DEBUG"
  }
}
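
(Once you are done debugging, setting the same loggers to null via the same endpoint restores the default levels:)

PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.repositories.hdfs": null,
    "logger.org.apache.hadoop": null
  }
}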

Then set this in the environment where you are running your Elasticsearch node:

HADOOP_JAAS_DEBUG=true
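
If Elasticsearch runs as a systemd service rather than from a shell, one way to set it is a drop-in override (a sketch; the unit name assumes a package install):

sudo systemctl edit elasticsearch
# then add the following to the override file and restart the service:
[Service]
Environment=HADOOP_JAAS_DEBUG=true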

And finally, specify the following jvm parameters when you start Elasticsearch:

-Djava.security.debug=logincontext -Dsun.security.krb5.debug=true
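
With a package install you can also put these in a jvm.options.d file instead of passing them on the command line (a sketch; the path assumes the default RPM/DEB layout, and jvm.options.d requires Elasticsearch 7.7 or later):

# /etc/elasticsearch/jvm.options.d/krb5-debug.options
-Djava.security.debug=logincontext
-Dsun.security.krb5.debug=true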

So your command to start Elasticsearch might look like the following (though if you already pass other arguments, you will need to include them as well):

HADOOP_JAAS_DEBUG=true ES_JAVA_OPTS="-Djava.security.debug=logincontext -Dsun.security.krb5.debug=true" bin/elasticsearch

You will get a lot of additional information in your Elasticsearch log file. A common problem we see is that the encryption type of your keytab is no longer supported by newer JVMs. If you want to post your logs here for additional advice, be careful to redact anything sensitive.
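
(As a quick check, klist can show the encryption type of each entry in the keytab; the file name is the one mentioned earlier in this thread:)

klist -ekt krb5.keytab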
