I'm working on some simple tests with Spark (standalone mode) and Elasticsearch. All my tests work fine as long as Elasticsearch is running on the same host as Spark. If I move Elasticsearch to a different node, I get an exception: "org.elasticsearch.hadoop.rest.EsHadoopTransportException: java.net.NoRouteToHostException: ..."
I have ruled out network / firewall issues, since I'm able to connect to the remote Elasticsearch node using Kibana.
The test is as simple as this:
import org.elasticsearch.spark._  // brings esRDD into scope

val spark = SparkSession.builder.master("local").appName("testing")
  .config("es.nodes", "remotenode")
  .getOrCreate()
val rdd = spark.sparkContext.esRDD("radio/artists")
Console.out.println(rdd.count())
I'm using Spark (Scala 2.11) with elasticsearch-spark-20_2.11-5.3.2.
Yes, you should configure DNS if you plan to use hostnames: the problem is that the driver cannot resolve "remotenode" from es.nodes. Alternatively, you can always use the IP address directly, e.g. es.nodes = "192.168.1.101:9200".
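A quick way to confirm this is the cause is to check, from the machine where the Spark driver runs, whether the hostname resolves at all. A minimal sketch (the `resolve` helper is hypothetical, and "remotenode" is just the hostname from the question; substitute your own):

```scala
import java.net.{InetAddress, UnknownHostException}

// Hypothetical helper: Some(ip) if the hostname resolves, None otherwise.
def resolve(host: String): Option[String] =
  try Some(InetAddress.getByName(host).getHostAddress)
  catch { case _: UnknownHostException => None }

// Run this on the Spark driver host before pointing es.nodes at a hostname.
println(resolve("localhost"))   // resolves on any machine
println(resolve("remotenode"))  // None unless DNS or /etc/hosts knows it
```

If `resolve` returns None for the name you put in es.nodes, fix DNS (or add an entry to /etc/hosts on the driver machine) or switch to the IP address.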