Hi,
I have started using the elasticsearch-hadoop Spark module, and I was able to successfully pull data from my production AWS Elasticsearch Service when running from my dev box (from IntelliJ).
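For reference, the job reads from Elasticsearch into a DataFrame and writes the result out to HDFS, roughly like the sketch below (the endpoint, index name, and output path are placeholders, not my real values):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object EsToHdfs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("es-to-hdfs")
      // Placeholder endpoint; the real value is my AWS Elasticsearch Service domain
      .set("es.nodes", "search-mydomain-xxxxxxxx.us-east-1.es.amazonaws.com")
      .set("es.port", "443")
      .set("es.net.ssl", "true")

    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Read an index into a DataFrame via the es-hadoop data source
    val df = sqlContext.read
      .format("org.elasticsearch.spark.sql")
      .load("myindex/mytype") // placeholder index/type

    // Write out to HDFS; this is the stage where the job fails on the cluster
    df.write.parquet("hdfs:///output/path")
  }
}
```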
But when I create a jar including all dependencies and try to run it on my Spark cluster, I see the errors below:
16/07/22 21:26:16 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
16/07/22 21:26:16 INFO HttpMethodDirector: Retrying request
16/07/22 21:26:16 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
16/07/22 21:26:16 INFO HttpMethodDirector: Retrying request
After a while, I get the following exception:
org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed;[nodes with ip and port]
at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:143)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:447)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:439)
at org.elasticsearch.hadoop.rest.RestRepository.scroll(RestRepository.java:454)
at org.elasticsearch.hadoop.rest.ScrollQuery.hasNext(ScrollQuery.java:92)
at org.elasticsearch.spark.rdd.AbstractEsRDDIterator.hasNext(AbstractEsRDDIterator.scala:43)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:262)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Any help is much appreciated.