How to load 2 million records from a CSV file into Elasticsearch on CentOS 7?

What is the best way to load 2 million records from a CSV file into Elasticsearch?

Can I use Spark for this purpose?

Maybe convert the CSV to JSON and then stream it to Elasticsearch? https://github.com/elastic/stream2es
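For reference, here is the same idea (CSV rows become JSON docs, streamed into Elasticsearch) using the official Python client's bulk helper rather than stream2es. A minimal sketch, assuming a local node, a CSV with a header row, and placeholder file and index names:

```python
# A sketch of CSV -> JSON docs -> bulk indexing with elasticsearch-py.
# The file name, index name, and chunk size are assumptions; adapt to your data.
import csv

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(["http://localhost:9200"])  # assumed local node

def actions(path, index):
    """Yield one bulk action per CSV row; DictReader uses the header row as keys."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yield {"_index": index, "_source": row}

# bulk() consumes the generator in chunks, so 2M rows never sit in memory at once.
ok, errors = bulk(es, actions("data.csv", "mydata"),
                  chunk_size=5000, raise_on_error=False)
print(f"indexed {ok} docs, {len(errors)} errors")
```

Because the rows are produced by a generator, this approach scales to files far larger than RAM.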

Also have a look at http://david.pilato.fr/blog/2015/04/28/exploring-capitaine-train-dataset/

Thank you, but this solution takes ages to run on a million records. Still, thank you; at least it works for small amounts of data.

@Jason Thank you. I couldn't make it work. Spark also has streaming:
https://spark.apache.org/docs/latest/streaming-programming-guide.html

Will this work for me with a large amount of data?
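For what it's worth, a static 2-million-row CSV usually doesn't need Spark Streaming: a plain batch job through the elasticsearch-hadoop connector covers it. A sketch, assuming the connector jar is supplied at submit time (for example via `--packages org.elasticsearch:elasticsearch-spark-20_2.11:<version>`) and hypothetical file and index names:

```python
# Batch PySpark sketch: read the CSV once, write it to Elasticsearch through
# the elasticsearch-hadoop connector. Paths, index name, and options are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("csv-to-es").getOrCreate()

# header=True uses the first CSV line as column names; inferSchema guesses types.
df = spark.read.csv("data.csv", header=True, inferSchema=True)

(df.write
   .format("org.elasticsearch.spark.sql")   # provided by the es-hadoop connector jar
   .option("es.nodes", "localhost")
   .option("es.port", "9200")
   .mode("append")
   .save("mydata"))                         # target index (older connectors expect "index/type")
```

Spark partitions the file, so the write runs in parallel across executors, which is where the speedup over a single-threaded loader comes from.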

You can increase the number of Logstash worker threads (the `-w` flag) and set it to the number of CPUs you have on your machine.
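Similarly, if you go with the Python bulk sketch from earlier in the thread instead of Logstash, the client offers the same knob: `helpers.parallel_bulk` sends chunks from a thread pool. This usage example reuses the hypothetical `es` client and `actions` generator defined above; `thread_count` is an assumption to match to your CPU count:

```python
# Parallel variant of the earlier bulk sketch: several worker threads send
# chunks concurrently instead of one request at a time.
from elasticsearch.helpers import parallel_bulk

for ok, info in parallel_bulk(es, actions("data.csv", "mydata"),
                              thread_count=4, chunk_size=5000):
    if not ok:
        print("failed:", info)  # each failed action is reported individually
```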

That's sad... or you can pay someone to do it for you :slight_smile: