Correct settings for "es.nodes.wan.only"


(Ankit Singh) #1

Hello,

I am trying to run a Spark job to load data from EMR to an ES cluster hosted by elastic.co. [cluster ID "665e60"]

Following is the code snippet I am using to test this.

import java.io.PrintStream
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.elasticsearch.spark._

val conf = new SparkConf()
conf.set("spark.es.nodes", "ssssss.us-east-1.aws.found.io")
conf.set("spark.es.port", "9243")
conf.set("spark.es.nodes.discovery", "true")
conf.set("spark.es.nodes.client.only", "false")
conf.set("spark.es.nodes.wan.only", "false")
conf.set("spark.es.net.http.auth.user", "sssss")
conf.set("spark.es.net.http.auth.pass", "lololol")

val sc = new SparkContext(conf)
// print(conf.toDebugString)

val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
val airports = Map("arrival" -> "Otopeni", "SFO" -> "San Fran")
sc.makeRDD(Seq(numbers, airports)).saveToEs("spark/docs")

Which results in the following error.
> org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
> at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:196)
> at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:379)
> at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
> at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:84)
> at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:84)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)

Can someone please suggest what config I am missing here?




(James Baiera) #2

Your Spark driver is able to connect to the ES Cloud instance, but it seems the executors cannot establish a connection. This is normally the case in cloud environments, since most of them keep their executors/task runners inside a separate secured network. You will need to make sure the executors/task runners can reach the provided ES node by configuring the network settings of your deployment.
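For reference, when targeting a hosted/WAN cluster like ES Cloud, the usual approach is the opposite of two of the settings in the original snippet: enable es.nodes.wan.only and disable node discovery, so the connector routes everything through the declared endpoint instead of trying to reach the cluster's internal node addresses. A minimal sketch, reusing the placeholder host and credentials from the original post:

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

val conf = new SparkConf()
conf.set("spark.es.nodes", "ssssss.us-east-1.aws.found.io")  // public endpoint only
conf.set("spark.es.port", "9243")
conf.set("spark.es.nodes.wan.only", "true")    // route all requests through the declared node
conf.set("spark.es.nodes.discovery", "false")  // skip discovery of internal node addresses
conf.set("spark.es.net.ssl", "true")           // hosted clusters typically serve HTTPS on 9243
conf.set("spark.es.net.http.auth.user", "sssss")
conf.set("spark.es.net.http.auth.pass", "lololol")

val sc = new SparkContext(conf)
sc.makeRDD(Seq(Map("one" -> 1))).saveToEs("spark/docs")

Even with those settings, the executors still need outbound network access to the endpoint, as noted above.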


(Ankit Singh) #3

Thanks James,
I was able to figure this out.


(Animageofmine) #4

@darthapple Can you share what the issue was and how you fixed it?


#5

@darthapple What was the resolution for this?
@animageofmine Were you able to resolve this? I am facing the same issue. Please help.


(Joby Johny) #6

I am facing this issue now. Could you please share the fix details?


(Mayank Vijay) #8

Hello,
I just want to view data stored in Elasticsearch as a Hive table. The Hive query I run is:
CREATE EXTERNAL TABLE testHiveELKTable (account int, quantity int) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource' = 'index/type');

But I get:
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'

I even tried setting es.nodes.wan.only = true, but the error still shows.
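For what it's worth, the Hive integration picks up its connection settings from TBLPROPERTIES next to es.resource, so es.nodes, es.port, and es.nodes.wan.only have to appear there to take effect for that table. A sketch along those lines, with the endpoint, port, and index/type as placeholders:

-- Sketch only: the endpoint, port, and index/type below are placeholders.
CREATE EXTERNAL TABLE testHiveELKTable (account INT, quantity INT)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.resource' = 'index/type',
  'es.nodes' = 'your-cluster-endpoint',
  'es.port' = '9200',
  'es.nodes.wan.only' = 'true'
);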


(donghe90) #9

This issue was a headache for me as well. I bypassed it by adding the executors' IP addresses to the AWS Elasticsearch access policy. Hope this helps.
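For anyone taking the same route, an IP-based access policy on an AWS Elasticsearch domain looks roughly like the sketch below. The region, account ID, domain name, and CIDR range are all placeholders that have to match your own deployment, and every executor's public IP must fall inside the allowed range:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": "es:*",
      "Resource": "arn:aws:es:us-east-1:111122223333:domain/my-domain/*",
      "Condition": {
        "IpAddress": { "aws:SourceIp": ["203.0.113.0/24"] }
      }
    }
  ]
}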


(Mohammed Sheik) #10

James,

We are running Spark locally as a standalone master, so if the Spark driver is able to connect, what could the other reasons be?