HDFS repository and HA NameNode settings

I can create an HDFS repository and run snapshot/restore successfully using Kerberos authentication. However, if I configure the NameNode URI the way it is defined in the Hadoop hdfs-site.xml file, it fails with this error:

{"error":{"root_cause":[{"type":"invocation_target_exception","reason":"invocation_target_exception: null"}],"type":"repository_exception","reason":"[flowtest_hdfs_repository] cannot create blob store","caused_by":{"type":"runtime_exception","reason":"runtime_exception: java.lang.reflect.InvocationTargetException","caused_by":{"type":"invocation_target_exception","reason":"invocation_target_exception: null","caused_by":{"type":"illegal_argument_exception","reason":"java.net.UnknownHostException: hadoop.log.labs","caused_by":{"type":"i_o_exception","reason":"hadoop.log.labs"}}}}},"status":500}

If the URI is replaced with the address of the actual active NameNode, m1.hadoop.log.labs:8020, it works fine.

The question is how to specify the NameNodes in the URI, or what other configuration is needed, so that the repository automatically connects to the active NameNode.

In the config setup:

"uri": "hdfs://hadoop.log.labs:8020/" ,
"conf.dfs.namenode.rpc-address.hadoop.log.labs.nn1": "m1.hadoop.log.labs:8020",
"conf.dfs.namenode.rpc-address.hadoop.log.labs.nn2": "m2.hadoop.log.labs:8020",

I'm assuming you're talking about configuring the repository to use an HA NameNode. Here is an example request with everything that you'll need to set:

curl -X PUT \
  http://localhost:9200/_snapshot/hdfsrepo \
  -H 'Content-Type: application/json' \
  -d '{
	"type": "hdfs",
	"settings": {
		"uri": "hdfs://ha-hdfs/",
		"path": "/user/elasticsearch/existing/repository",
		"security.principal": "elasticsearch@REALM"
		"conf.dfs.nameservices": "ha-hdfs",
		"conf.dfs.ha.namenodes.ha-hdfs": "nn1,nn2",
		"conf.dfs.namenode.rpc-address.ha-hdfs.nn1": "m1.hadoop.log.labs:8020",
		"conf.dfs.namenode.rpc-address.ha-hdfs.nn2": "m2.hadoop.log.labs:8020",
		"conf.dfs.client.failover.proxy.provider.ha-hdfs": "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
	}
}'

Note that I am not using a port in the URI - just the nameservice.

Also note that the nameservice (ha-hdfs) is explicitly called out with conf.dfs.nameservices and the NameNode IDs are called out with conf.dfs.ha.namenodes.<nameservice>.

If you are using HA NameNodes, you should only specify port numbers on the NameNode addresses, not in the URI. Only the nameservice goes in the URI.

Finally, you must also configure the client failover proxy (the last config entry); otherwise, if your active NameNode fails over, the repository will be unavailable until it comes back online. This proxy provider is what actually implements the client-side failover logic.
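Once the repository is registered, it's worth confirming that the plugin can actually reach the active NameNode before scheduling real snapshots. A minimal sketch, assuming the repository name hdfsrepo from the example above and an Elasticsearch node at localhost:9200 (the snapshot name is just an illustration):

curl -X POST http://localhost:9200/_snapshot/hdfsrepo/_verify

curl -X PUT 'http://localhost:9200/_snapshot/hdfsrepo/test_snapshot_1?wait_for_completion=true'

The _verify call checks that the nodes in the cluster can access the repository, so a nameservice or failover misconfiguration (like the UnknownHostException above) shows up immediately rather than during a scheduled snapshot.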

Edit: This post previously had incorrect configurations. The configurations should now be correct.

Thanks for the info. I'm going to try out the suggestions and will share the results.

Ajay

I followed the suggested config and also experimented with adding additional http/https addresses, but it still failed with an unknown host error. The same config with the URI replaced by the active NameNode and port worked fine. It appears the client is still trying to resolve the URI via DNS rather than through the nameservice. Did I misconfigure any parameter?

curl -k -X PUT https://elastic:xxxxxxxxx@localhost:9200/_snapshot/test_hdfs_repository \
  -H 'Content-Type: application/json' -d '{
"type": "hdfs",
"settings": {
"uri": "hdfs://hadoop.log.labs/",
"path": "/elasticsearch/respositories/test",
"security.principal": "xxxx@YYY.ZZZ",
"conf.dfs.nameservices": "hadoop.log.labs",
"conf.dfs.ha.namenodes.hadoop.log.labs": "nn1,nn2",
"conf.dfs.namenode.rpc-address.ha-hdfs.nn1": "m1.hadoop.log.labs:8020",
"conf.dfs.namenode.rpc-address.ha-hdfs.nn2": "m2.hadoop.log.labs:8020",
"conf.dfs.namenode.http-address.hadoop.log.labs.nn1": "m1.hadoop.log.labs:50070",
"conf.dfs.namenode.http-address.hadoop.log.labs.nn2": "m2.hadoop.log.labs:50070",
"conf.dfs.namenode.https-address.hadoop.log.labs.nn1": "m1.hadoop.log.labs:50470",
"conf.dfs.namenode.https-address.hadoop.log.labs.nn2": "m2.hadoop.log.labs:50470",
"conf.dfs.client.failover.proxy.provider.ha-hdfs": "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
}
}'
{"error":{"root_cause":[{"type":"invocation_target_exception","reason":"invocation_target_exception: null"}],
"type":"repository_exception","reason":"[test_hdfs_repository] cannot create blob store",
"caused_by":{"type":"runtime_exception","reason":"runtime_exception: java.lang.reflect.InvocationTargetException",
"caused_by":{"type":"invocation_target_exception","reason":"invocation_target_exception: null",
"caused_by":{"type":"illegal_argument_exception",
"reason":"java.net.UnknownHostException:hadoop.log.labs",
"caused_by":{"type":"i_o_exception","reason":"hadoop.log.labs"}}}}},"status":500}

Would you be able to share the logs from the Elasticsearch node here for that exception?

Here are selected log lines (trimmed due to post size limits):

[2018-10-26T14:30:05,130][WARN ][r.suppressed             ] path: /_snapshot/netsectest_hdfs_repository, params: {repository=netsectest_hdfs_repository}
org.elasticsearch.transport.RemoteTransportException: [nets_m02][10.236.233.168:9300][cluster:admin/repository/put]
Caused by: org.elasticsearch.repositories.RepositoryException: [netsectest_hdfs_repository] cannot create blob store
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.blobStore(BlobStoreRepository.java:336) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.startVerification(BlobStoreRepository.java:635) ~[elasticsearch-6.4.2.jar:6.4.2]
        
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: org.elasticsearch.common.io.stream.NotSerializableExceptionWrapper: runtime_exception: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.fs.AbstractFileSystem.newInstance(AbstractFileSystem.java:136) ~[?:?]
        at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:165) ~[?:?]
        at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:250) ~[?:?]
        
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
Caused by: org.elasticsearch.common.io.stream.NotSerializableExceptionWrapper: invocation_target_exception: null
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:?]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]
        
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: hadoop.log.labs
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418) ~[?:?]
        at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:130) ~[?:?]
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:343) ~[?:?]

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
Caused by: java.io.IOException: hadoop.log.labs
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418) ~[?:?]
        at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:130) ~[?:?]
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:343) ~[?:?]
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:287) ~[?:?]
        at org.apache.hadoop.fs.Hdfs.<init>(Hdfs.java:91) ~[?:?]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:?]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_181]
        at org.apache.hadoop.fs.AbstractFileSystem.newInstance(AbstractFileSystem.java:134) ~[?:?]
        at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:165) ~[?:?]
        at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:250) ~[?:?]
        at org.elasticsearch.repositories.hdfs.HdfsRepository.lambda$createBlobstore$0(HdfsRepository.java:130) ~[?:?]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181]
        at javax.security.auth.Subject.doAs(Subject.java:360) ~[?:1.8.0_181]
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1787) ~[?:?]
        at org.elasticsearch.repositories.hdfs.HdfsRepository.createBlobstore(HdfsRepository.java:128) ~[?:?]
        at org.elasticsearch.repositories.hdfs.HdfsRepository.lambda$createBlobStore$1(HdfsRepository.java:228) ~[?:?]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181]
        at org.elasticsearch.repositories.hdfs.HdfsRepository.createBlobStore(HdfsRepository.java:227) ~[?:?]
        at org.elasticsearch.repositories.hdfs.HdfsRepository.createBlobStore(HdfsRepository.java:53) ~[?:?]
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.blobStore(BlobStoreRepository.java:332) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.startVerification(BlobStoreRepository.java:635) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.repositories.RepositoriesService.lambda$verifyRepository$2(RepositoriesService.java:218) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) ~[elasticsearch-6.4.2.jar:6.4.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
