Bulk insert to Elastic from Logstash "bulk_path"

baselai · August 22, 2019, 2:47pm

I have a huge CSV file, around 500k rows, that I'm processing and want to push to Elasticsearch.

This output:

elasticsearch {
    index => "%{pipelineId}"
    hosts => ["${domain:port}/"]
}

It outputs one document at a time! It takes 11-14 minutes.
Here is what I'm doing to send in bulk, but it takes the same time as above:

elasticsearch {
    index => "%{pipelineId}"
    hosts => ["${domain:port}/"]
    bulk_path => "${domain:port}/_bulk"
}

Any ideas why?

Badger · August 22, 2019, 6:12pm

What makes you think the elasticsearch output is sending one document at a time?

baselai · August 22, 2019, 6:17pm

@Badger I'm printing to the terminal every time it sends to Elasticsearch:

    stdout { codec => rubydebug }
    file {
        path => "/usr/share/output/output.json"
        codec => "json_lines"
    }

And also the time: 11-13 minutes for 500k JSON documents pushed to ES.

Badger · August 22, 2019, 6:45pm

The elasticsearch output uses the bulk API to load data. I haven't checked the code, but I would expect it to load each pipeline batch (by default 125 events) as a single API call.

It would not surprise me if rubydebug is your rate-limiting component. How long does it take if you remove the non-elasticsearch outputs?
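
As a rough sketch of that test, the output section would contain only the elasticsearch block. The index and hosts values are copied from the config posted above; leaving out bulk_path is an assumption on my part that the plugin's default _bulk endpoint is what you want:

    output {
        # Only the elasticsearch output remains; stdout (rubydebug) and file
        # are removed so they cannot slow down the pipeline during the timing test.
        elasticsearch {
            index => "%{pipelineId}"
            hosts => ["${domain:port}/"]
        }
    }

If that turns out noticeably faster, the batch size is the next thing to experiment with, via pipeline.batch.size in logstash.yml or the -b command-line flag. The default is 125 events per worker batch.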

baselai · August 22, 2019, 6:57pm

I just ran it, with 854430 docs pushed to ES, it took 9 minutes.

Christian_Dahlqvist · August 22, 2019, 7:42pm

What is the size of your documents? What is the size and specification of your Elasticsearch cluster?

system · September 19, 2019, 7:42pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
