Best practice for storing big (~500 MB) files in ES

I would like to store binary data chunks of up to ~500 MB in ES. These are not meant to be analyzed or used for searching; I merely want to store them and be able to retrieve them later. The reason for storing them in ES is that the data is related to documents indexed in ES, and I want to keep everything in one place.

Now, I have figured out that by using the binary field type (so that the data is not analyzed) and by increasing the http.max_content_length limit, I can upload and retrieve data chunks of sufficient size.
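For reference, a minimal sketch of that first approach, assuming the official elasticsearch Python client (8.x-style API); the index name, field name, and file path are made up for illustration, and http.max_content_length has to be raised in elasticsearch.yml on the node side (the default is 100mb):

```python
import base64
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# "binary" fields hold a base64-encoded string and are neither indexed nor searchable,
# so the data is only kept for retrieval via _source.
es.indices.create(
    index="attachments",
    mappings={"properties": {"payload": {"type": "binary"}}},
)

with open("big_file.bin", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

# Note the base64 payload is roughly 33% larger than the raw file, so the
# http.max_content_length limit has to account for that overhead too.
es.index(index="attachments", id="file-1", document={"payload": encoded})

# Retrieval: fetch the document and decode the payload back to bytes.
doc = es.get(index="attachments", id="file-1")
data = base64.b64decode(doc["_source"]["payload"])
```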

Another approach I tried, which didn't require messing with http.max_content_length, was to split the data into smaller chunks (10 MB each) and store them as multiple documents in ES. This is a bit clunkier, but it also worked.
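A sketch of that chunking approach, again assuming the Python client; the index name, field names, and id scheme are made up, and the caller is assumed to know the chunk count at read time:

```python
import base64
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB per chunk, well under the default request limit


def store_chunked(file_id: str, path: str) -> int:
    """Split the file into 10 MB pieces, index each as its own document, return the chunk count."""
    seq = 0
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            es.index(
                index="file_chunks",
                id=f"{file_id}-{seq}",
                document={
                    "file_id": file_id,
                    "seq": seq,
                    "data": base64.b64encode(chunk).decode("ascii"),
                },
            )
            seq += 1
    return seq


def load_chunked(file_id: str, total_chunks: int) -> bytes:
    """Fetch the chunks in order by id and reassemble the original bytes."""
    parts = []
    for seq in range(total_chunks):
        doc = es.get(index="file_chunks", id=f"{file_id}-{seq}")
        parts.append(base64.b64decode(doc["_source"]["data"]))
    return b"".join(parts)
```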

I wonder whether either of these approaches is better than the other with respect to performance or potential issues, or whether there is another best practice for storing big files in ES.

Thank you.

Elasticsearch is really not designed for that. I'd love for blob support to happen, but I don't think it ever will.
I'd probably keep my blobs in another system like HDFS, CouchDB, Couchbase, MapR, or whatever you prefer, but storing 500 MB blobs in Elasticsearch is going to be a nightmare IMO.
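If it helps, a tiny sketch of that pattern: the blob lives in external storage and the ES document only carries metadata plus a pointer to it. The index name, fields, and the HDFS path here are purely illustrative.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index only the searchable metadata; "blob_location" points at the real data
# sitting in whatever blob store you picked (HDFS, Couchbase, an object store, ...).
es.index(
    index="documents",
    id="doc-42",
    document={
        "title": "Quarterly report",
        "content_type": "application/octet-stream",
        "size_bytes": 524288000,
        "blob_location": "hdfs:///blobs/doc-42.bin",
    },
)
```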


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.