[elastic/elasticsearch] Cannot bulk index a JSON file greater than 100MB in Elasticsearch. Tried changing HTTP content length but it doesn't work

I am using Elasticsearch 6.3, the latest version. I am not able to bulk index a JSON file larger than 100MB. When I run the command below to bulk index, there is no error, but the index is not created. I tried modifying the HTTP content length setting, but that doesn't work. Is there any way to index a file larger than 100MB?

Command which I run to bulk index:
curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@accounts.json"

A common guideline is to keep the size of each bulk request to around 5MB. Sending larger bulk requests than that does not necessarily result in better performance.

Each bulk request, as in? What other way is there to store a large amount of data in an index?

Are you storing lots of small documents or very large documents?

small documents

Then break it up into multiple smaller bulk requests. A bulk request is meant to send a group of documents in a single request, but it very rarely contains all of your documents.
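To make the "break it up" advice concrete, here is a minimal Python sketch that splits a bulk file such as accounts.json into request bodies of roughly 5MB each. It assumes every action line is followed by exactly one source line (true for index and create actions, not for delete), so pairs are never split across requests. The names chunk_bulk_lines and send_bulk are hypothetical helpers made up for this illustration; the URL is taken from the curl command above.

```python
import urllib.request
from typing import Iterable, Iterator, List


def chunk_bulk_lines(lines: Iterable[str],
                     max_bytes: int = 5 * 1024 * 1024) -> Iterator[str]:
    """Yield bulk request bodies of at most about max_bytes each.

    Assumption: each action line is followed by exactly one source line.
    A single pair larger than max_bytes is still sent on its own.
    """
    batch: List[str] = []
    batch_size = 0
    it = iter(lines)
    for action in it:
        if not action.strip():
            continue  # skip blank lines, e.g. a trailing newline
        source = next(it, None)
        if source is None:
            raise ValueError("action line without a matching source line")
        # Normalise line endings; the bulk body must end with a newline.
        pair = action.rstrip("\r\n") + "\n" + source.rstrip("\r\n") + "\n"
        pair_size = len(pair.encode("utf-8"))
        if batch and batch_size + pair_size > max_bytes:
            yield "".join(batch)
            batch, batch_size = [], 0
        batch.append(pair)
        batch_size += pair_size
    if batch:
        yield "".join(batch)


def send_bulk(body: str,
              url: str = "http://localhost:9200/bank/_doc/_bulk") -> bytes:
    """POST one bulk body; the URL mirrors the curl command in this thread."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Usage would look something like opening the file with `open("accounts.json", encoding="utf-8")` and calling `send_bulk(body)` for each `body` from `chunk_bulk_lines(f)`. It is worth checking the `errors` field in each response, since a bulk request can succeed at the HTTP level while individual items fail.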

So you are trying to say I should bulk index files that are each smaller than 100MB? For example, I indexed a file containing 51,000 small documents. Its size is 95MB, and it gets bulk indexed. If I add another field to each of the 51,000 documents, the total file size exceeds 100MB and it doesn't get indexed.

Try to keep the size of each request to around 5MB. If you use a tool like Logstash it will do this for you.

Should each file be 5MB, or each document?

If you have small documents, you should try to keep the size of each bulk request around that level.
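As a rough worked example using the numbers mentioned earlier in this thread (51,000 documents in a 95MB file), the short Python calculation below estimates how many documents fit into a 5MB request. It assumes 1MB = 1,000,000 bytes and that the documents are of similar size, so treat the result as a ballpark figure only.

```python
import math

file_bytes = 95_000_000      # 95MB file from the thread (assuming 1MB = 10**6 bytes)
doc_count = 51_000           # documents in that file
target_bytes = 5_000_000     # suggested size per bulk request

avg_doc_bytes = file_bytes / doc_count               # about 1.9KB per document
docs_per_request = int(target_bytes // avg_doc_bytes)
request_count = math.ceil(doc_count / docs_per_request)

print(f"average document size: {avg_doc_bytes:.0f} bytes")
print(f"documents per 5MB request: {docs_per_request}")
print(f"bulk requests needed: {request_count}")
```

So the 5MB guideline applies to each request, not to each document or to the file as a whole: the same data would go in as a couple of dozen requests of a few thousand documents each.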

It doesn't seem practical to bulk index 5MB files again and again. There are many e-commerce websites that use Elasticsearch. They definitely store GBs of data. There should be a way.

What does the bulk size have to do with how much data you store in the cluster?

Even if I just upload 1000 documents per bulk request, eventually the cluster will hold millions or billions of documents.

Okay. Thanks for your reply.

Have you looked at "Loading Wikipedia's Search Index For Testing"?
