Not able to write data to ES using pig script - Connection error


#1

Hi,

I am able to write and get data from ES cluster (one master index node and a dedicated data node)but when I run pig script from same node getting connection error.
Pig version: 14
ES hadoop jar: elasticsearch-hadoop-5.0.0.jar

Note: I removed host:port information from stack trace

Error:

2016-11-04 23:19:57,310 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 2 Succeeded: 1 Running: 1 Failed: 0 Killed: 0 FailedTaskAttempts: 3, diagnostics=, counters=null
2016-11-04 23:24:15,397 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=FAILED, progress=TotalTasks: 2 Succeeded: 1 Running: 0 Failed: 1 Killed: 0 FailedTaskAttempts: 4, diagnostics=Vertex failed, vertexName=scope-39, vertexId=vertex_1477081880267_2745072_1_01, diagnostics=[Task failed, taskId=task_1477081880267_2745072_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:247)
at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:545)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:173)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:149)
at org.elasticsearch.hadoop.pig.EsStorage.putNext(EsStorage.java:189)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
at org.apache.tez.mapreduce.output.MROutput$1.write(MROutput.java:503)
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:129)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:378)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:243)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:362)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:192)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:184)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:184)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:180)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[host:port]]
at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:150)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:444)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:424)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:428)
at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:154)
at org.elasticsearch.hadoop.rest.RestClient.remoteEsVersion(RestClient.java:609)
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:240)
... 23 more

extract from pig script:
store temp INTO '20161102/test' USING org.elasticsearch.hadoop.pig.EsStorage( 'es.resource = 20161102/test', 'es.nodes=host','es.port=port') ; -- NOTE have removed actual host and port

Any help what I am missing. Note that I can query and put data to ES cluster, can do telnet to the host:port from same node. So not sure why connectivity issue.

Thanks in advance.


#2

I got the solution. I had to set es.net.proxy.http.host and es.net.proxy.http.port as well to make that work.


(system) #3