Store binary files in Elasticsearch

Arpita11 · January 31, 2022, 8:02am

For my project, we are storing the data in Elasticsearch.
I have a new requirement regarding documents (i.e. the actual binaries):

Store documents and be able to search the text within the document binary.
Allow the user to download the document after I display the search results.

I came across a few discussions mentioning that Elasticsearch is not designed to store big BLOBs.
I need a suggestion on whether I should store the data in Elasticsearch or on a file system.

warkolm · January 31, 2022, 8:26am

Welcome to our community!

If you decode the blobs and then index them, Elasticsearch can search in them. Otherwise it's not going to be worth it. You would then store the metadata in there to search on that.

dadoonet · January 31, 2022, 8:32am

Take a look at the FSCrawler project as well. It could help you.

Arpita11 · January 31, 2022, 8:33am

Thank you for the quick response.

Can we possibly run into performance issues later due to big file sizes?

warkolm · January 31, 2022, 8:39am

How big?

Christian_Dahlqvist · January 31, 2022, 8:45am

Storing large binary objects in Elasticsearch is not recommended. Instead, store the extracted and indexed text together with the location of the binary object, e.g. on S3, so you can retrieve it from there when needed.
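
A minimal sketch of that layout, assuming the text has already been extracted elsewhere. The field names (`content`, `binary_location`, and so on) and the S3 path are hypothetical placeholders, not anything prescribed by Elasticsearch. It only builds the JSON body you would send to the index API:

```python
import json


def build_search_document(extracted_text, filename, content_type, size_bytes, binary_location):
    """Build the JSON body to index: searchable text plus a pointer to the binary."""
    return {
        "content": extracted_text,           # full-text searchable field (hypothetical name)
        "filename": filename,
        "content_type": content_type,
        "size_bytes": size_bytes,
        "binary_location": binary_location,  # where the download is served from
    }


doc = build_search_document(
    extracted_text="Quarterly report: revenue grew in all regions.",
    filename="report.pdf",
    content_type="application/pdf",
    size_bytes=1048576,
    binary_location="s3://example-bucket/docs/report.pdf",  # hypothetical location
)

# The binary itself never enters the index; only its text and location do.
print(json.dumps(doc, indent=2))
```

The search results page can then show hits from `content` and use `binary_location` to build the download link.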

Arpita11 · January 31, 2022, 8:48am

This requirement is to build up a knowledge exchange site with images, Word documents, PDFs, etc. The number of files will increase over time. The expected file size is up to 10 MB as of now.

warkolm · January 31, 2022, 9:00am

Maybe you should look at Elastic Workplace Search then.

Arpita11 · January 31, 2022, 11:36am

My application is on .NET Core.
I am also going through this blog post on integration using NEST: "The Future of Attachments for Elasticsearch and .NET" (Elastic Blog).
It is based on the ingest attachment processor plugin.

Could you please advise whether this approach will scale as the number of attachments increases?
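
For reference, a rough sketch of what the ingest attachment approach looks like at the REST level, written in Python only to build the request bodies. The pipeline id `attachments`, the index name, and the `binary_location` field are assumptions for illustration. The `remove` processor drops the base64 payload after extraction so that only the extracted text stays in the index:

```python
import base64
import json

# Body for: PUT _ingest/pipeline/attachments   (pipeline id is hypothetical)
pipeline_body = {
    "description": "Extract text from base64-encoded attachments",
    "processors": [
        # Reads base64 from "data" and writes extracted text under "attachment"
        {"attachment": {"field": "data", "target_field": "attachment"}},
        # Drop the base64 payload so the binary is not kept in the index
        {"remove": {"field": "data"}},
    ],
}


def build_index_body(file_bytes, binary_location):
    """Body for: PUT knowledge/_doc/<id>?pipeline=attachments (index name is hypothetical)."""
    return {
        "data": base64.b64encode(file_bytes).decode("ascii"),
        "binary_location": binary_location,  # hypothetical pointer to the stored file
    }


body = build_index_body(b"%PDF-1.4 example bytes", "s3://example-bucket/docs/report.pdf")
print(json.dumps(pipeline_body, indent=2))
print(json.dumps(body))
```

With this setup the extracted text ends up in `attachment.content`, while the original file is still downloaded from wherever `binary_location` points.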

system · February 28, 2022, 11:37am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
