How to load 2 million records of CSV data into Elasticsearch on CentOS 7?


(Anjitha) #1

What is the best way to load 2 million records of CSV data into Elasticsearch?

Can I use Spark for this purpose?
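For reference, one common approach that doesn't need Spark is the bulk helpers in the official Elasticsearch Python client. The sketch below is just that, a sketch: the host, index name, column layout, and chunk size are all assumptions, not anything from this thread.

```python
import csv


def generate_actions(csv_path, index_name):
    """Yield one bulk-index action per CSV row (field names come from the header)."""
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            yield {"_index": index_name, "_source": row}


def load_csv(csv_path, index_name="csv-data", host="http://localhost:9200"):
    # Requires: pip install elasticsearch
    from elasticsearch import Elasticsearch
    from elasticsearch.helpers import streaming_bulk

    es = Elasticsearch(host)
    ok_count = 0
    # streaming_bulk sends the documents in batches, so all 2M rows
    # never have to sit in memory at once.
    for ok, _ in streaming_bulk(
        es, generate_actions(csv_path, index_name), chunk_size=5000
    ):
        if ok:
            ok_count += 1
    return ok_count
```

Because the rows are streamed in batches, a few million records is mostly a question of how fast the cluster can index, not of client memory.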


(Jason Wee) #2

Maybe convert the CSV to JSON and then stream it to ES? https://github.com/elastic/stream2es
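The CSV-to-JSON step mentioned above could look like the sketch below (file names are placeholders, and treating every value as a string is an assumption):

```python
import csv
import json


def csv_to_json_lines(csv_path, json_path):
    """Convert a CSV file to newline-delimited JSON, one document per line."""
    with open(csv_path, newline="") as src, open(json_path, "w") as dst:
        # The CSV header row becomes the JSON field names; all values stay strings.
        for row in csv.DictReader(src):
            dst.write(json.dumps(row) + "\n")
```

The resulting newline-delimited JSON is the shape tools like stream2es (and the bulk API itself) expect as input.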


(David Pilato) #3

Have a look also at http://david.pilato.fr/blog/2015/04/28/exploring-capitaine-train-dataset/


(Anjitha) #4

Thank you, but this solution takes ages to run for millions of records. Still, thank you; at least it works for small amounts of data.


(Anjitha) #5

@Jason Thank you. I couldn't make it work. Spark also has streaming:
https://spark.apache.org/docs/latest/streaming-programming-guide.html

Will this work for a large amount of data?


(David Pilato) #6

You can increase the number of Logstash workers and set it to the number of CPUs you have on your machine.
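For context, a minimal Logstash pipeline for a job like this might look like the sketch below; the file path, column names, and index name are assumptions. The worker count mentioned above is set with the `-w` flag (or `pipeline.workers` in `logstash.yml`), not in the pipeline file itself.

```
input {
  file {
    path => "/path/to/data.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"   # re-read the file from the start on every run
  }
}

filter {
  csv {
    separator => ","
    columns => ["id", "name", "value"]   # assumed column names
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "csv-data"
  }
}
```

Then run it with one worker per CPU core, e.g. `bin/logstash -w 8 -f csv-pipeline.conf` on an 8-core machine.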


(Jason Wee) #7

that's sad... or you can pay someone to do it for you :slight_smile:

