I'm hitting the following error when trying to save a Spark DataFrame to Elasticsearch:
16/03/30 18:28:38 ERROR NetworkClient: Node [172.18.0.2:9200] failed (Connection timed out: connect); selected next node [10.123.45.67:9200]
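(172.18.0.2 isn't an address I've configured anywhere; the only Elasticsearch address in my config is 10.123.45.67.)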
I can reach Elasticsearch from the same machine the Spark app is running on:
nc -zv 10.123.45.67 9200
Connection to 10.123.45.67 9200 port [tcp/*] succeeded!
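To see where 172.18.0.2 comes from, my understanding is that the address a node publishes can be inspected via the standard Elasticsearch _nodes REST endpoint (checking publish_address in the output):

curl http://10.123.45.67:9200/_nodes/http?pretty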
My code:
import com.google.common.collect.ImmutableMap;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.elasticsearch.spark.sql.api.java.JavaEsSparkSQL;

SparkConf conf = new SparkConf();
conf.setMaster("local[*]");
conf.setAppName("Analyser");
conf.set("es.index.auto.create", "true");
conf.set("es.nodes", "10.123.45.67:9200");

JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));
JavaDStream<String> lines = jssc.socketTextStream("localhost", 7654);
JavaDStream<String> eventLines = lines.filter((String line) -> line.contains("Event"));

eventLines.foreachRDD((JavaRDD<String> rdd) -> {
    if (!rdd.isEmpty()) {
        SQLContext sqlContext = SQLContext.getOrCreate(rdd.context());
        DataFrame dataFrame = sqlContext.read().json(rdd);
        dataFrame.registerTempTable("Events");
        DataFrame resultDataFrame = sqlContext.sql("SELECT * FROM Events");
        resultDataFrame.show(false);
        // Fails here with the NetworkClient error above
        JavaEsSparkSQL.saveToEs(resultDataFrame, "event/states", ImmutableMap.of("es.mapping.id", "eventId"));
    }
});

jssc.start();
jssc.awaitTermination();
jssc.stop();
Full stack trace from the Spark app:
https://gist.githubusercontent.com/dkirrane/8485d8d6f4c422310ec69d0e89271b35/raw/25f7ee5dc050b8cf88997bf2e9cd1d35d09a7112/45834.log
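Since 172.18.0.2 looks like a Docker bridge address, my guess is that the es-hadoop connector's node discovery is returning an address that isn't reachable from my machine. From the connector docs, as I understand it, discovery can be disabled with es.nodes.wan.only; a minimal sketch of the config change I have in mind (untested on my side):

// Assumption: Elasticsearch runs inside Docker and publishes 172.18.0.2,
// which I can't reach. This setting keeps the connector pinned to the
// address in es.nodes instead of addresses returned by node discovery.
conf.set("es.nodes.wan.only", "true");

Is that the right way to handle this, or is something else going on?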