Migrating Elasticsearch data

Hey Everyone,

I have a disconnected Elasticsearch [1] environment and I need to transfer its data to another Elasticsearch [2] cluster.
My plan is to export all of the data from [1] to a JSON file (with a Logstash pipeline), transfer the JSON file to the [2] cluster, and then use the bulk API to import the indices.

Is there a better way to perform this operation, given that clusters [1] and [2] are not connected via the internet?

Hi Shahaf,
A faster way is snapshot/restore.

Hey @wangqinghuan,
I did not mention that I am not allowed to change the Elasticsearch configuration (e.g. adding "path.repo" to elasticsearch.yml), so snapshot/restore is not an option for me.
Therefore I am trying to export the Elasticsearch indices to a JSON file and import it with the bulk API into the other cluster.

This is my Logstash pipeline:

input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "*"
    docinfo => true
    size => 10000
  }
}

output {
  file {
    path => "/var/log/logstash/index.json"
    codec => json_lines
  }
}
After getting the JSON file, the bulk API does not seem to work against the second cluster. Am I doing anything wrong?

Bulk API:
curl -v -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/_bulk?pretty' --data-binary @/var/log/logstash/index.json

Does index.json contain the action_and_meta_data lines? The bulk API expects alternating action/metadata and source lines, like this:

{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
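If it helps, that two-line pattern can also be built programmatically. A minimal Python sketch (the function name and sample data here are illustrative, not from your pipeline):

```python
import json

def to_bulk_body(docs, index_name):
    """Build an NDJSON bulk body: one action/metadata line followed by
    one source line per document. `docs` is a list of (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a trailing newline.
    return "\n".join(lines) + "\n"

body = to_bulk_body([("1", {"field1": "value1"})], "test")
print(body)
```

The resulting string is exactly what `--data-binary` should send to `_bulk`.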

You could use elasticdump: https://www.npmjs.com/package/elasticdump

That is probably my issue: my file does not have the first (action/metadata) line:
{ "index" : { "_index" : "test", "_id" : "1" } }

How can I export all of my indices to one file, with the structure you mentioned, using a Logstash pipeline?

That approach requires configuration changes on my side, which in my case are not an option.

I don't know how to export the metadata line with a Logstash pipeline. However, you can use Logstash to import index.json into your new Elasticsearch cluster.

I found this Python script:

#!/usr/bin/env python3
filepath = '<PATH TO JSON FILE>'
metadata = '{ "index": { "_index": "INDEX_NAME", "_type": "_doc" } }'
with open(filepath, mode="r", encoding="utf-8") as my_file:
    for line in my_file:
        print(metadata)
        print(line, end="")

I ran it against the JSON export from the first Elasticsearch cluster (redirecting its output to a new file). The script adds the missing action_and_meta_data line before each document, and after that I could use the bulk API to push the data to the second Elasticsearch cluster.
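One caveat I ran into reading about _bulk: Elasticsearch rejects request bodies above http.max_content_length (100 MB by default, as far as I know), so a large patched file may need to be split into smaller NDJSON chunks before POSTing. A quick sketch of a splitter (the function name and chunk size are my own, illustrative choices):

```python
import json

def split_bulk_file(lines, docs_per_chunk=1000):
    """Split an NDJSON bulk stream (alternating action and source lines)
    into chunks of at most `docs_per_chunk` documents each.
    `lines` is any iterable of NDJSON lines; yields lists of lines."""
    chunk = []
    for line in lines:
        chunk.append(line.rstrip("\n"))
        # Two lines (action + source) make up one document.
        if len(chunk) >= docs_per_chunk * 2:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

# Example: 3 documents split into chunks of at most 2 documents.
sample = []
for i in range(3):
    sample.append(json.dumps({"index": {"_index": "test", "_id": str(i)}}))
    sample.append(json.dumps({"field1": "value%d" % i}))
chunks = list(split_bulk_file(sample, docs_per_chunk=2))
print([len(c) // 2 for c in chunks])  # documents per chunk -> [2, 1]
```

Each chunk can then be written to its own file and POSTed to _bulk separately.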

If there is a way to export the data in the following format:

{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }

that would be the preferred way.

Anyway, thank you for your help @wangqinghuan!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.