Getting a "No data nodes with HTTP-enabled available" error when writing from Spark to Elasticsearch on Google Dataproc

(Ben Weisburd) #1

I'm trying to export data from Spark to an Elasticsearch cluster running on Google Container Engine (GKE). I've deployed the ES cluster using configs that create a couple of each node type: master, client, and data.

I'm able to insert data into ES through the Spark connector if I have it connect to one of the client nodes and set es.nodes.client.only=true.

After some reading, though, I'd like to have Spark write directly to the data nodes. However, if I switch back to the default es.nodes.client.only=false, I get this error:

org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: No data nodes with HTTP-enabled available
	at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:58)
	at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:91)
	at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:91)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
	at org.apache.spark.executor.Executor$
	at java.util.concurrent.ThreadPoolExecutor.runWorker(
	at java.util.concurrent.ThreadPoolExecutor$

Error summary: EsHadoopIllegalArgumentException: No data nodes with HTTP-enabled available
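
For reference, the two connector modes described above can be written out as settings. This is only a minimal sketch: the helper name `es_write_options` and the host name are hypothetical, and only the `es.nodes`, `es.port`, and `es.nodes.client.only` keys are actual elasticsearch-hadoop settings.

```python
def es_write_options(nodes, port=9200, client_only=False):
    """Build a dict of elasticsearch-hadoop settings for a Spark write.

    `nodes` is a list of host names the connector should contact first.
    This helper is hypothetical; only the setting keys come from the connector.
    """
    return {
        "es.nodes": ",".join(nodes),
        "es.port": str(port),
        # "true":  route all requests through the client nodes
        # "false": the default; the connector talks to data nodes over HTTP
        "es.nodes.client.only": "true" if client_only else "false",
    }


# Hypothetical host name, for illustration only.
via_clients = es_write_options(["es-client.example.internal"], client_only=True)
direct = es_write_options(["es-client.example.internal"])
```

In PySpark, a dict like this could be passed along the lines of `df.write.format("org.elasticsearch.spark.sql").options(**direct).save("my-index/my-type")`, where the index and type names are placeholders.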

(James Baiera) #2

I'm not too well versed in Docker, but looking through the configurations for the linked deployment, I found a setting that is probably responsible for the lack of HTTP availability on the data nodes.

I also only see network configurations for transport level traffic (port 9300, non-http). I don't think this configuration is meant to be used in the manner that you are describing.
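
One way to confirm this from outside the cluster is to look at what the nodes info API (`GET _nodes/http`) reports and see whether any data node lists an HTTP endpoint. The sketch below is not the connector's own code. It assumes the response shape used by newer Elasticsearch versions, where each node has a `roles` list and an `http` object (older versions report the data/master flags differently), and the node names and addresses in the sample are made up.

```python
import json


def http_enabled_data_nodes(nodes_response):
    """Return the sorted names of data nodes that also expose HTTP.

    `nodes_response` is the parsed JSON body of `GET _nodes/http`.
    Assumes each node entry has a `roles` list and, when HTTP is enabled,
    an `http` object; adjust for the Elasticsearch version in use.
    """
    found = []
    for node_id, info in nodes_response.get("nodes", {}).items():
        is_data = "data" in info.get("roles", [])
        has_http = bool(info.get("http"))
        if is_data and has_http:
            found.append(info.get("name", node_id))
    return sorted(found)


# Made-up sample: the data node has no HTTP section, the client node does.
sample = json.loads("""
{
  "nodes": {
    "n1": {"name": "es-data-1", "roles": ["data"]},
    "n2": {"name": "es-client-1", "roles": [],
           "http": {"publish_address": "10.0.0.2:9200"}}
  }
}
""")

usable = http_enabled_data_nodes(sample)
```

With a response like the sample, `usable` is empty, which is the situation the connector reports as "No data nodes with HTTP-enabled available".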

(Ben Weisburd) #3

Thanks, toggling that to "true" does fix it.

(Ben Weisburd) #4

I'm not sure how much of a difference this makes, but isn't it suboptimal that the Elasticsearch Spark connector communicates with Elasticsearch over HTTP rather than the TCP transport protocol on port 9300?

(James Baiera) #5

The TCP transport protocol is not backwards compatible between versions of Elasticsearch the way HTTP is. A fair number of benchmarks comparing HTTP and RPC have also found that the two have comparable performance characteristics.

(Ben Weisburd) #6

Got it. Thanks.
