What is the best way to load 2 million records of CSV data into Elasticsearch?
Can I use Spark for this purpose?
Maybe convert the CSV to JSON and then stream it to ES? https://github.com/elastic/stream2es
Also have a look at http://david.pilato.fr/blog/2015/04/28/exploring-capitaine-train-dataset/
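The CSV-to-JSON step above can be sketched with the standard library alone. This is a minimal, hypothetical example (index name and columns are made up) that turns CSV rows into the newline-delimited body the Elasticsearch `_bulk` API expects, pairing each document with an `index` action line:

```python
import csv
import io
import json

def csv_to_bulk_ndjson(csv_text, index_name):
    """Convert CSV rows into the newline-delimited JSON body
    expected by the Elasticsearch _bulk API: one action line
    followed by one document line per record."""
    lines = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(row))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"

sample = "name,age\nalice,30\nbob,25\n"
body = csv_to_bulk_ndjson(sample, "people")
print(body)
```

The resulting body would be POSTed to `http://<host>:9200/_bulk` with the `Content-Type: application/x-ndjson` header; for millions of records you would send it in batches rather than as one giant request.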
Thank you, but this solution takes ages to run on millions of records. Still, thank you; at least it works for small amounts of data.
@Jason Thank you, but I couldn't make it work. Spark also has streaming:
https://spark.apache.org/docs/latest/streaming-programming-guide.html
Will this work for a large amount of data?
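Spark can write to Elasticsearch through the elasticsearch-hadoop connector, but whichever tool you use, the thing that makes large loads fast is sending documents in bulk batches rather than one request per record. A minimal stdlib sketch of that batching step (document shapes and batch size are illustrative, not from the thread):

```python
import itertools

def batched(iterable, size):
    """Yield lists of up to `size` items; a bulk loader sends each
    list as one _bulk request instead of one request per document."""
    it = iter(iterable)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk

docs = ({"id": i} for i in range(2_000_000))
# With a batch size of 5000, two million docs become 400 bulk requests.
n_batches = sum(1 for _ in batched(docs, 5000))
print(n_batches)  # 400
```

In Spark the same idea applies per partition: each executor batches its partition's rows and bulk-indexes them in parallel, which is essentially what the elasticsearch-hadoop connector does for you.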
You can increase the number of Logstash pipeline workers and set it to the number of CPUs you have on your machine.
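For reference, a minimal Logstash pipeline for this kind of job might look like the sketch below; the file path, column names, and index name are placeholders you would replace with your own:

```
input {
  file {
    path => "/path/to/data.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    columns => ["col1", "col2"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mydata"
  }
}
```

You can then raise the worker count at launch, e.g. `bin/logstash -f csv.conf -w 8` to run eight pipeline workers.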
that's sad... or you can pay someone to do it for you