Is there a quicker way to import data to Elastic?

I have exported Elasticsearch indices using Logstash with the following pipeline configuration:

    - pipeline.id: export-process
      pipeline.workers: 4
      config.string: |
        input {
          elasticsearch {
            hosts => "http://elastic:80/elasticsearch/"
            user => "elastic"
            password => ""
            ssl => false
            index => "metricbeat-*"
            docinfo => true
            query => '{
                "query": {
                  "bool": {
                    "filter": {
                      "range": {
                          "@timestamp": {
                          "gte": "now-35m",
                          "lte": "now",
                          "format": "strict_date_optional_time||epoch_millis"
                          }
                      }
                    }
                  }
              }
            }'
          }
        }
        output {
          file {
            gzip => true
            path => "/usr/share/logstash/export/export_%{[@metadata][_index]}.json.gz"
          }
        }

Now I am trying to import it back into another instance. I have gunzipped the exported .json.gz file, and I am going over each line in the file and running:

    curl -s -XPOST http://1.2.3.4:9000/metricbeat/_doc/ -H "Content-Type: application/json" -d "$1"

where $1 is a single line from the JSON file. This method is very slow: I started the import of one index, which is 1.7 GB, and it is still running after 90 minutes. Is there a better way of doing this?

Hi John,

Are you able to use the _bulk API instead?
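
For reference, _bulk expects newline-delimited JSON where every document is preceded by an action line, so the payload would look something like this (index name and document fields here are placeholders):

    { "index": { "_index": "metricbeat" } }
    { "@timestamp": "2022-08-01T00:00:00Z", "some": "field" }
    { "index": { "_index": "metricbeat" } }
    { "@timestamp": "2022-08-01T00:00:30Z", "another": "field" }

and would be sent with something like:

    curl -s -XPOST http://1.2.3.4:9000/_bulk \
      -H "Content-Type: application/x-ndjson" \
      --data-binary @bulk.ndjson

--data-binary matters here, since -d strips the newlines the API relies on, and the file must end with a newline.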

Why not use Logstash with a file input and an elasticsearch input? Or even Filebeat?
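
On the import side that would be a file input feeding an elasticsearch output. A minimal sketch, with placeholder paths, host, and index name:

    input {
      file {
        # "read" mode treats each file as a finite batch rather than tailing it
        path => "/usr/share/logstash/export/*.json"
        mode => "read"
        codec => "json"
        sincedb_path => "/dev/null"
        # in read mode the default completed action deletes the file, so log instead
        file_completed_action => "log"
        file_completed_log_path => "/tmp/imported_files.log"
      }
    }
    output {
      elasticsearch {
        hosts => "http://1.2.3.4:9000"
        index => "metricbeat"
      }
    }

As far as I know, read mode can also process the gzipped export files directly, so you may not even need to unzip them first.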

Also, if your instances can communicate with each other, you could try a remote reindex, or maybe create a snapshot in a cloud repository and restore from the snapshot.
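
A remote reindex would be a single call on the destination cluster, roughly like this (hosts and credentials are placeholders, and the source host has to be allowed via reindex.remote.whitelist in the destination's elasticsearch.yml):

    curl -s -XPOST http://1.2.3.4:9000/_reindex \
      -H "Content-Type: application/json" -d '{
        "source": {
          "remote": {
            "host": "http://elastic:80",
            "username": "elastic",
            "password": "..."
          },
          "index": "metricbeat-*"
        },
        "dest": {
          "index": "metricbeat"
        }
      }'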


Hi @carly.richmond, when I try the bulk import I get this in the response:

    < Warning: 299 Elasticsearch-8.3.3-801fed82df74dbe537f89b71b098ccaff88d2c56 "Unsupported action: [stream]. Supported values are [create], [delete], [index], and [update]. Unsupported actions are currently accepted but will be rejected in a future version."
    < content-type: application/json;charset=utf-8
    < content-length: 329
    * HTTP error before end of send, stop sending
    <
    * Closing connection 0
    {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [VALUE_STRING]"}],"type":"illegal_argument_exception","reason":"Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [VALUE_STRING]"},"status":400}

So it looks like the format of the output is not what the bulk API expects. I also had to increase http.max_content_length, as the default of 100mb was too small.
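
I could probably massage the export into the expected format by adding an action line before each document and chunking the file to stay under the size limit. Something like this untested sketch, with the file and index names hardcoded for illustration:

    # prefix every document with a bulk action line
    awk '{ print "{\"index\":{\"_index\":\"metricbeat\"}}"; print }' export.json > bulk.ndjson
    # split into chunks; use an even line count so action/document pairs stay aligned
    split -l 10000 bulk.ndjson bulk_part_
    for f in bulk_part_*; do
      curl -s -XPOST http://1.2.3.4:9000/_bulk \
        -H "Content-Type: application/x-ndjson" \
        --data-binary @"$f"
    done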
Thanks for the link!

The instances I am exporting from are ephemeral, hence the need to harvest the data from them now so it can be imported at a later date. I take it you mean an elasticsearch output rather than an input? That might be an option; I will try it out.

Yes, as you've clarified, I think @leandrojmp's great suggestion is to use the file input and elasticsearch output plugins. I would recommend trying that approach rather than bulk, given the error above.

Let us know how you get on!
