How can I save documents like docx, pdf, xls and other Microsoft documents in Elasticsearch?


(Agwani) #1

Is there any way to add Microsoft documents to Elasticsearch and search the documents?


(David Pilato) #3

The ingest-attachment plugin is built for that.

Have a look also at the FSCrawler project in case it helps.
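
For readers who want to see what the ingest-attachment route looks like, here is a minimal sketch of the three request bodies involved: an ingest pipeline with an `attachment` processor, a document carrying the file as a Base64 string, and a search on the extracted text. The pipeline name `attachment`, the index name `documents`, and the field name `data` are assumptions chosen for illustration. The sketch only builds and prints the JSON, so you would still need to send it to your cluster with the client of your choice.

```python
import base64
import json

PIPELINE_ID = "attachment"  # assumed pipeline name, pick your own
INDEX = "documents"         # assumed index name, pick your own


def pipeline_body(field="data"):
    """Body for PUT _ingest/pipeline/<id>: run the attachment processor on `field`."""
    return {
        "description": "Extract text from binary documents",
        "processors": [{"attachment": {"field": field}}],
    }


def document_body(raw_bytes, field="data"):
    """Body for PUT <index>/_doc/<id>?pipeline=<id>: the file as a Base64 string."""
    return {field: base64.b64encode(raw_bytes).decode("ascii")}


def search_body(text):
    """Body for GET <index>/_search: match against the text the processor extracted."""
    return {"query": {"match": {"attachment.content": text}}}


if __name__ == "__main__":
    # Placeholder bytes; in practice read a real file with open(path, "rb").read()
    file_bytes = b"placeholder bytes standing in for a docx/pdf/xls file"

    print("PUT _ingest/pipeline/" + PIPELINE_ID)
    print(json.dumps(pipeline_body(), indent=2))

    print("PUT " + INDEX + "/_doc/1?pipeline=" + PIPELINE_ID)
    print(json.dumps(document_body(file_bytes), indent=2))

    print("GET " + INDEX + "/_search")
    print(json.dumps(search_body("invoice"), indent=2))
```

The processor writes its output under the `attachment` field by default, which is why the search targets `attachment.content`.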


(Ambar) #4

Hello! If you want a quick and dirty solution then, as @dadoonet says, 'ingest-attachment' is the best solution for you. But in some cases it will not work.

The basic issues are:

* Converting a binary stream to BASE64 adds about 30% size overhead
* Storing the source file data in ES is way too expensive and useless, even if you store it separately and exclude it from the index (especially in BASE64 string format)
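
The "about 30%" figure follows from how Base64 works: every 3 input bytes become 4 output characters, so the encoded payload is roughly one third larger than the original file. A small sketch to check this yourself (the helper name `base64_overhead` is made up for this example):

```python
import base64


def base64_overhead(n_bytes):
    """Return the relative size increase when n_bytes of binary data are Base64-encoded."""
    raw = bytes(n_bytes)  # n_bytes zero bytes; the content does not affect the encoded size
    encoded = base64.b64encode(raw)
    return len(encoded) / len(raw) - 1.0


if __name__ == "__main__":
    for size in (3_000, 3_000_000):
        print(f"{size} bytes -> overhead {base64_overhead(size):.1%}")
```

This measures only the encoding itself. Whatever the cluster spends on storing or transferring the encoded string comes on top of it.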

Read this post for more details: https://blog.ambar.cloud/ingest-attachment-plugin-for-elasticsearch-should-you-use-it/

