Using ES Spark to copy data from one instance to another

(Ramdev Wudali) #1

Is it possible to use the Spark API to read an index from one ES cluster and write the same data into a different ES cluster? If so, how can I do it? Basically, what would the configuration look like, and how can I get the RDD to line up with the different SparkContexts?

Thanks much


(Costin Leau) #2

No, within a job ES-Hadoop works only against a single cluster. You could try using tribe nodes; however, this is an unsupported scenario.

You can simply read the data, store it somewhere intermediate (disk, Spark, HDFS, S3, etc.), and then start another job to write it to the other ES cluster.
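
A minimal PySpark sketch of that two-job approach, in case it helps. It assumes the elasticsearch-hadoop (elasticsearch-spark) jar is on the Spark classpath and that Parquet is an acceptable staging format. The host names, index name, staging path, and the helper names `es_options`, `export_job` and `import_job` are all made up for illustration and are not part of ES-Hadoop.

```python
# Sketch only: hosts, index names and the staging path are hypothetical.
# Requires the elasticsearch-hadoop connector jar on the Spark classpath.

ES_FORMAT = "org.elasticsearch.spark.sql"  # Spark SQL data source name of the connector


def es_options(nodes, port="9200"):
    """Connector settings for one cluster; each job talks to exactly one."""
    return {"es.nodes": nodes, "es.port": port}


def export_job(spark, source_nodes, index, staging_path):
    """Job 1: read the index from the source cluster and stage it as Parquet."""
    df = (
        spark.read.format(ES_FORMAT)
        .options(**es_options(source_nodes))
        .load(index)
    )
    df.write.mode("overwrite").parquet(staging_path)


def import_job(spark, target_nodes, index, staging_path):
    """Job 2: read the staged Parquet files and index them into the target cluster."""
    df = spark.read.parquet(staging_path)
    (
        df.write.format(ES_FORMAT)
        .options(**es_options(target_nodes))
        .mode("append")
        .save(index)
    )
```

Each function would be called from its own Spark application (its own `SparkSession`), for example `export_job(spark, "source-es.example.com", "my-index", "hdfs:///tmp/es-staging")` in the first and `import_job(spark, "target-es.example.com", "my-index", "hdfs:///tmp/es-staging")` in the second. As written, the sketch does not carry over document IDs, so the target cluster assigns new ones.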
