I have a 200 MB file that I want to index into Elasticsearch, so I want to use parallel bulk.
The file is in this format: [{},{},{},{}]
Basically, it is an array of objects.
But when I try to index it using parallel bulk, nothing is indexed into Elasticsearch.
How do I index data using parallel bulk?
Do I need to convert the data into a particular format before using parallel bulk? If so, please specify the format.
The bulk interface is for sending multiple documents (not necessarily all of them) in a single request, and the request has to follow the format described in the documentation. It is recommended that bulk requests be limited to around 5 MB in size, so you should break your data up into multiple requests.
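To illustrate the format and the size limit, here is a rough sketch that turns a list of objects into newline-delimited bulk bodies, with each document preceded by an action line and each body kept under a byte budget. The function name `bulk_bodies`, the index name `my-index`, and the 5 MB default are assumptions made for illustration, not something taken from your setup.

```python
import json


def bulk_bodies(docs, index, max_bytes=5 * 1024 * 1024):
    """Yield bulk request bodies (NDJSON), each at most about max_bytes.

    Hypothetical helper: every document becomes two lines, an action line
    followed by the document itself, and every line ends with a newline.
    """
    lines, size = [], 0
    for doc in docs:
        action = json.dumps({"index": {"_index": index}})
        pair = action + "\n" + json.dumps(doc) + "\n"
        pair_size = len(pair.encode("utf-8"))
        # Start a new body when adding this pair would exceed the budget.
        if lines and size + pair_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(pair)
        size += pair_size
    if lines:
        yield "".join(lines)
```

Each yielded string could then be sent as the body of one bulk request. A single document larger than `max_bytes` still goes out in a body of its own.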
It would have helped if you had explained in the first post that you are using the Python client. Can you show us your code? Are you parsing the input file and treating the objects one by one in the code?
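For comparison, here is a minimal sketch of how this could look with the Python client's `helpers.parallel_bulk`, assuming the whole file fits in memory once parsed. One thing worth checking in your code: `parallel_bulk` returns a lazy generator, so nothing is sent until you iterate over its results. The names `generate_actions` and `index_file`, the index name, and the thread and chunk settings are assumptions for illustration.

```python
import json


def generate_actions(docs, index):
    """Wrap each parsed object in the action dict the bulk helpers expect."""
    for doc in docs:
        yield {"_index": index, "_source": doc}


def index_file(client, path, index):
    """Index a JSON array file and return the number of failed documents."""
    # Imported here so generate_actions can be tried without the client installed.
    from elasticsearch import helpers

    with open(path, encoding="utf-8") as f:
        docs = json.load(f)  # assumption: the parsed array fits in memory

    failures = 0
    # parallel_bulk is lazy: the loop below is what actually drives indexing.
    for ok, item in helpers.parallel_bulk(
        client,
        generate_actions(docs, index),
        thread_count=4,
        chunk_size=500,
        raise_on_error=False,  # report failures instead of raising
    ):
        if not ok:
            failures += 1
    return failures


# Usage (hypothetical host, file path and index name):
# from elasticsearch import Elasticsearch
# client = Elasticsearch("http://localhost:9200")
# print(index_file(client, "data.json", "my-index"))
```

With the helpers you pass plain dicts and the client builds the newline-delimited bulk format for you, so the file does not need to be converted by hand.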