Is it possible to perform bulk inserts from Spark to Elasticsearch?
At the moment, I'm using the 'saveToEsWithMeta' method for upserting the data (JavaPairRDD). Is there a way to bulk insert using the _bulk API? Are there any examples I could take a look at?
All writes in Elasticsearch-Hadoop (including the Spark integration) are done using the
bulk API underneath (through the REST protocol, and thus the _bulk
endpoint). Whether you are saving 1, 100, or 10K documents, the procedure is the same.
Btw, I recommend spending some time reading the whole reference
documentation as it covers the architecture pretty well and provides plenty
of examples.
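Since every write already goes through _bulk, the relevant knobs are the batch-size settings. As a sketch, these are the es-hadoop configuration properties I'd look at (defaults as I read them from the configuration reference; double-check against the docs for your version):

```
# es-hadoop bulk write settings (passed via SparkConf / Hadoop configuration)
es.batch.size.bytes        = 1mb    # flush a bulk request once this many bytes accumulate (default 1mb)
es.batch.size.entries      = 1000   # ...or once this many documents accumulate (default 1000)
es.batch.write.refresh     = true   # refresh the index after each bulk write completes (default true)
es.batch.write.retry.count = 3      # retries for documents rejected by an overloaded cluster (default 3)
```

These can be set programmatically on the SparkConf (e.g. `conf.set("es.batch.size.entries", "1000")`) before calling saveToEsWithMeta; the sizes apply per task instance, so total bulk pressure on the cluster scales with the number of concurrent tasks.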
@diplomaticguru Why are you looking at the source and not the official, rendered docs, which are available here? The docs are mentioned in the GitHub README and on the project homepage.
How did you come across es-hadoop? It's an honest question, since it looks like the reference documentation was not advertised enough and I'd like to address that.