I am using Elasticsearch 6.3 (the latest version). I am not able to bulk index a JSON file larger than 100MB. When I run the command below to bulk index, there is no error, but the index is not created. I tried modifying http.max_content_length, but it doesn't work. Is there any way to index a file larger than 100MB?
The command I run to bulk index:
curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@accounts.json"
A common guideline is to keep each bulk request to around 5MB in size. Sending larger bulk requests than that does not necessarily result in better performance.
Break the file up into multiple smaller bulk requests. Bulk requests are meant to send groups of documents in a single request, very rarely all of your documents at once.
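As a minimal sketch (assuming accounts.json is in the standard bulk format, i.e. alternating action and source lines), you can chunk the file with split and send each piece in its own request. The line count per chunk must be even so action/source pairs stay together:

# 10,000 lines = 5,000 documents per chunk; adjust to land near ~5MB per request
split -l 10000 accounts.json chunk_
for f in chunk_*; do
  curl -s -H "Content-Type: application/json" \
    -XPOST "localhost:9200/bank/_doc/_bulk" --data-binary "@$f" > /dev/null
done

Dropping refresh from the per-chunk requests avoids forcing a refresh after every batch; you can refresh the index once at the end instead.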
So you're saying I should bulk index files that are each less than 100MB? For example, I indexed a file containing 51,000 small documents; its size is 95MB and it gets bulk indexed fine. If I add one more field to each of the 51,000 documents, the total file size exceeds 100MB and it doesn't get indexed.
It doesn't seem practical to bulk index 5MB files again and again. There are many e-commerce websites that use Elasticsearch, and they definitely store GBs of data. There should be a way.