How to reindex from ES cluster1 to ES cluster2 with Spark


(Jack Wang) #1

I have two ES clusters and I want to reindex the data from cluster1 to cluster2, but I found that I can only set up one SparkContext with one ES cluster, for example:
import org.apache.spark.SparkConf
val sparkConf: SparkConf = new SparkConf().setAppName("EsReIndex")
sparkConf.set("es.nodes", "node1:9200")

So how can I reindex the data between the two ES clusters?
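
For reference, here is a minimal sketch of one way this might work inside a single SparkContext. It assumes the elasticsearch-spark connector is on the classpath and that the configuration map passed to each read or write call overrides the `es.nodes` value for that call only. `node2:9200` and `my_index/my_type` are hypothetical placeholders, not values from this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

object EsReIndex {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("EsReIndex"))

    // Per-call settings, one map per cluster.
    // node2:9200 is a placeholder for the second cluster.
    val sourceCfg = Map("es.nodes" -> "node1:9200")
    val destCfg   = Map("es.nodes" -> "node2:9200")

    // Read (document id, document) pairs from cluster1.
    // "my_index/my_type" is a placeholder resource name.
    val docs = sc.esRDD("my_index/my_type", sourceCfg)

    // Write the pairs to cluster2, using each pair's key as the document id.
    docs.saveToEsWithMeta("my_index/my_type", destCfg)

    sc.stop()
  }
}
```

Because the cluster address travels with each call rather than with the SparkConf, one job can read from one cluster and write to the other. Whether this performs well enough for a full reindex depends on index size and cluster settings.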


(Ed) #2

How about just using Logstash (an oldie but a goodie)?

https://www.elastic.co/guide/en/logstash/2.4/plugins-inputs-elasticsearch.html

input {
  elasticsearch {
    .....details for cluster 1
  }
}
output {
  elasticsearch {
    ...... Details for cluster2
  }
}

This is how we did it back in the old 1.3 days. I also know some people keep all their documents in Kafka, so by just resetting the topic offset you can replay the whole topic.


(James Baiera) #3

@Jack_Wang Another option you could try is the Reindex API in Elasticsearch.
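
As a rough sketch of what that could look like between two clusters: Elasticsearch 5.0 and later let the Reindex API pull from a remote cluster. The example below assumes such a version, assumes cluster2 lists `node1:9200` under `reindex.remote.whitelist` in its `elasticsearch.yml`, and assumes JDK 11+ for the HTTP client. The object name, the `node2` host, and the index names are hypothetical.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object RemoteReindex {
  // Builds the JSON body for POST _reindex with a remote source.
  // Assumes host and index names contain no characters that need JSON escaping.
  def reindexBody(remoteHost: String, sourceIndex: String, destIndex: String): String =
    s"""{"source":{"remote":{"host":"$remoteHost"},"index":"$sourceIndex"},"dest":{"index":"$destIndex"}}"""

  // Sends the request to the destination cluster, which then pulls the
  // documents from the remote source. Returns the HTTP status code.
  def submit(destCluster: String, body: String): Int = {
    val request = HttpRequest.newBuilder(URI.create(s"$destCluster/_reindex"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(body))
      .build()
    HttpClient.newHttpClient()
      .send(request, HttpResponse.BodyHandlers.ofString())
      .statusCode()
  }
}
```

A call might look like `RemoteReindex.submit("http://node2:9200", RemoteReindex.reindexBody("http://node1:9200", "my_index", "my_index"))`. The request goes to cluster2, the destination, so no Spark job is involved in moving the data.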

