I am stuck in a situation where my Elasticsearch (7.9.0) upgrade is pending because a backup/snapshot is not in place.
We have a 10-node Elasticsearch cluster in Azure: 3 master nodes and 7 data nodes. Our requirement is to snapshot the Elasticsearch indices to HDFS. Our HDFS nodes form an on-premise cluster, which means Elasticsearch and HDFS sit in separate VLANs.
We have opened the necessary ports for HDFS (8020, 8022, 9865, 9866, 9867, 9871), which are the RPC and HTTP ports of the NameNode and DataNodes respectively.
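To confirm each port is actually reachable across the VLANs from an Elasticsearch node, a quick TCP check like this works (the hostname below is a placeholder):

# test TCP reachability of the NameNode RPC port from an Elasticsearch node
nc -vz hdfs-namenode.example.com 8020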
We have also created a keytab file for Elasticsearch named krb5.keytab, and it is a valid one because I can kinit with it.
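For example, this is how the keytab can be verified (the path and principal below are placeholders):

# confirm the keytab is usable; path and principal are placeholders
kinit -kt /etc/elasticsearch/krb5.keytab elasticsearch/es-node-1@EXAMPLE.COM
klist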
After all this, when I try the API request to create the HDFS repository, I get the following error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "repository_verification_exception",
        "reason" : "[my_hdfs_repository] path is not accessible on master node"
      }
    ],
    "type" : "repository_verification_exception",
    "reason" : "[my_hdfs_repository] path is not accessible on master node",
    "caused_by" : {
      "type" : "i_o_exception",
      "reason" : "Could not get block locations. Source file \"/user/Elasticsearch/repositories/my_hdfs_repository/tests-FEy4DNwjS-C8OR8t2dzCzg/pending-master.dat-UCIz69_kTNSFsgNEuBsDtg\" - Aborting…block==null"
    }
  },
  "status" : 500
}
I have installed the repository-hdfs plugin at version 7.9.0, matching the Elasticsearch version.
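For reference, the create-repository request follows the pattern from the plugin documentation; the URI, path, and principal below are placeholders for our real values:

PUT /_snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://hdfs-namenode.example.com:8020/",
    "path": "/user/Elasticsearch/repositories/my_hdfs_repository",
    "security.principal": "elasticsearch@EXAMPLE.COM"
  }
}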
I have tried every workaround I could find and even re-installed the plugin, yet the issue persists.
If anyone has suggestions or advice on this case, please do let me know; it would be very helpful and also a learning point for me.
Hi @Sahana_Murthy. Do you have all of your Elasticsearch nodes configured to talk to HDFS? Can you do something like hadoop dfs -ls /user/Elasticsearch/repositories from all of them?
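For example, something like this run against every node would do (the node names below are placeholders):

# list the repository path from each Elasticsearch node; hostnames are placeholders
for host in es-node-1 es-node-2 es-node-3; do
  ssh "$host" 'hadoop dfs -ls /user/Elasticsearch/repositories'
done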
You are right that you do not need the HDFS client on the Elasticsearch nodes -- but if you did, I thought that would be an easy way to test that you can reach the HDFS cluster from your Elasticsearch nodes. It's definitely not required, though.
Assuming that your HDFS cluster is healthy and reachable from all of your Elasticsearch nodes, and that you are using a user who has permission to read and write to HDFS, I would suspect Kerberos next.
First, raise the logging levels for the HDFS repository and the Hadoop client:
PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.repositories.hdfs": "DEBUG",
    "logger.org.apache.hadoop": "DEBUG"
  }
}
Then set this in the environment where you are running your Elasticsearch node:
HADOOP_JAAS_DEBUG=true
And finally, specify the following JVM parameters when you start Elasticsearch.
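Assuming the standard JDK Kerberos debug properties are what's meant here, those are:

# JDK Kerberos debug properties (add to jvm.options or pass via ES_JAVA_OPTS)
-Dsun.security.krb5.debug=true
-Dsun.security.spnego.debug=true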
So your command to start Elasticsearch might look like the following (although you might already have other arguments you're passing in that you'll need to keep):
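# assumes the two Kerberos debug properties above; adjust paths to your install layout
ES_JAVA_OPTS="-Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true" ./bin/elasticsearch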
You will get a lot of additional information in your Elasticsearch log file. A common problem we see is that the encryption type of your keytab is no longer supported by newer JVMs. If you want to post your logs here for additional advice, be careful to redact anything sensitive.
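One quick way to see which encryption types your keytab actually contains (the path below is a placeholder):

# list keytab entries with timestamps and encryption types; path is a placeholder
klist -kte /etc/elasticsearch/krb5.keytab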