Cannot connect to Elasticsearch from Hive using an AWS instance and Docker containers


(Sanjay A P) #1

Hi,

I am running two Docker containers on an AWS instance: one for the HDP sandbox and the other for Elasticsearch. Both containers are up and running on their respective ports (HDP on 8080 and Elasticsearch on 9400). I have also added the elasticsearch-hadoop jar. But when I try to read data from Elasticsearch, I get the error below.

CREATE EXTERNAL TABLE hd2 (users String, message String) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource'='hdp/json', 'es.nodes'='5.21.33.44', 'es.port'='9400',"es.index.auto.create"="TRUE", 'es.nodes.wan.only'='TRUE','es.query'='?q=users:kimchy');

06:04:30,489 ERROR [60f09da6-50ba-40d1-8def-069859df4d29 HiveServer2-Handler-Pool: Thread-71]: rest.NetworkClient (:()) - Node [127.0.0.1:9400] failed (Connection refused (Connection refused)); no other nodes left - aborting...
2019-01-24T06:04:30,489 INFO [60f09da6-50ba-40d1-8def-069859df4d29 HiveServer2-Handler-Pool: Thread-71]: conf.HiveConf (HiveConf.java:getLogIdVar(5080)) - Using the default value passed in for log id: 60f09da6-50ba-40d1-8def-069859df4d29
2019-01-24T06:04:30,489 INFO [60f09da6-50ba-40d1-8def-069859df4d29 HiveServer2-Handler-Pool: Thread-71]: session.SessionState (:()) - Resetting thread name to HiveServer2-Handler-Pool: Thread-71
2019-01-24T06:04:30,493 INFO [HiveServer2-Handler-Pool: Thread-71]: conf.HiveConf (HiveConf.java:getLogIdVar(5080)) - Using the default value passed in for log id: 60f09da6-50ba-40d1-8def-069859df4d29
2019-01-24T06:04:30,494 INFO [60f09da6-50ba-40d1-8def-069859df4d29 HiveServer2-Handler-Pool: Thread-71]: conf.HiveConf (HiveConf.java:getLogIdVar(5080)) - Using the default value passed in for log id: 60f09da6-50ba-40d1-8def-069859df4d29
2019-01-24T06:04:30,489 WARN [HiveServer2-Handler-Pool: Thread-71]: thrift.ThriftCLIService (:()) - Error fetching results:
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:467) ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328) ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:910) ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_191]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_191]
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_191]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_191]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.0.1.0-187.jar:?]
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at com.sun.proxy.$Proxy66.fetchResults(Unknown Source) ~[?:?]
at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:564) ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:786) ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_191]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'

I am not sure why it is trying to connect to 127.0.0.1:9400 even though I have specified my IP in the Hive command.

Any help would be greatly appreciated :)

@james.baiera any suggestions ?


(James Baiera) #2

Try running this again with 'hive.server2.logging.operation.level' set to "VERBOSE" in the Hive settings. This should log all the debugging information to the console. If you can share the output here, it would help the debugging process.
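
As a minimal sketch of that step, the property can usually be set for the current session before re-running the query. This assumes a Beeline (or similar) session connected to HiveServer2 and that the sandbox allows this property to be changed at runtime; if it does not, it would need to be set in the Hive configuration instead. The SELECT is only an illustration of a read against the hd2 table from the original post.

```sql
-- Assumption: session-level override is permitted on this HiveServer2 instance.
SET hive.server2.logging.operation.level=VERBOSE;

-- Confirm the value that is now in effect for this session.
SET hive.server2.logging.operation.level;

-- Re-run a read against the Elasticsearch-backed table to reproduce the error
-- with verbose operation logging enabled (illustrative query).
SELECT * FROM hd2 LIMIT 10;
```

The verbose output should show which connection settings the storage handler actually received, which may help explain why the client falls back to 127.0.0.1 instead of the address given in 'es.nodes'.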