Storing binary files in Elastic

Note that the mapper-attachments plugin is deprecated and has been replaced by ingest-attachment.
The latter plugin "only" extracts text from your binary document without storing the binary itself in Elasticsearch.
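
To illustrate the extraction step, here is a minimal sketch of what an ingest-attachment pipeline and the matching request body could look like, written as plain Python dictionaries. The field name "data" and the helper name build_ingest_request are assumptions made for this example. The pipeline body would typically be registered with PUT _ingest/pipeline/<name>, and documents indexed with ?pipeline=<name>. The remove processor is included so that the base64 payload is explicitly dropped and only the extracted text remains.

```python
import base64
import json

# Pipeline body for the ingest-attachment plugin.
# The source field name "data" is an assumption for this sketch.
ATTACHMENT_PIPELINE = {
    "description": "Extract text from a base64-encoded binary field",
    "processors": [
        # Extracts text and metadata into the "attachment" field.
        {"attachment": {"field": "data"}},
        # Drops the base64 payload so it is not kept in the document.
        {"remove": {"field": "data"}},
    ],
}


def build_ingest_request(raw_bytes: bytes) -> dict:
    """Return the JSON body to index through the pipeline (hypothetical helper)."""
    # The attachment processor expects the binary as a base64 string.
    return {"data": base64.b64encode(raw_bytes).decode("ascii")}


if __name__ == "__main__":
    print(json.dumps(ATTACHMENT_PIPELINE, indent=2))
    print(json.dumps(build_ingest_request(b"hello world")))
```

The sketch only builds the JSON bodies; sending them to a cluster is left to whichever HTTP or Elasticsearch client you already use.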

Storing binary documents in Elasticsearch is not ideal. Imagine storing an MP4 movie (easily 4 GB to 10 GB) in a Lucene segment: it does not really make sense. Elasticsearch was not designed for that purpose.
In such a case, I prefer to use a separate BLOB store, for example:

  • HDFS
  • CouchDB
  • S3
    ...

Then just index the extracted content in Elasticsearch, along with a URL pointing to the source blob.
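
As a rough sketch of that idea, the document sent to Elasticsearch could carry only the searchable text plus a pointer to the blob. The class name BlobDocument, its field names, and the s3:// URL below are all hypothetical and chosen only for illustration; the blob itself is assumed to have been uploaded to the external store beforehand.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class BlobDocument:
    """Searchable metadata for a binary stored outside Elasticsearch.

    All field names are assumptions made for this sketch.
    """
    filename: str
    content: str      # text extracted from the binary
    blob_url: str     # where the original binary actually lives
    size_bytes: int


def to_index_body(doc: BlobDocument) -> dict:
    """Build the JSON body to index; the binary itself is never included."""
    return asdict(doc)


if __name__ == "__main__":
    doc = BlobDocument(
        filename="demo.mp4",
        content="Transcript or description extracted from the movie",
        blob_url="s3://example-bucket/movies/demo.mp4",  # hypothetical location
        size_bytes=4 * 1024**3,
    )
    print(json.dumps(to_index_body(doc), indent=2))
```

A search hit then returns blob_url, and the application fetches the binary from the BLOB store rather than from Elasticsearch.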

I hope it makes sense.
