Out of memory exception when full-text indexing a PDF file of about 100 MB with Elasticsearch

Hao · March 18, 2015, 10:02pm

I am using Elasticsearch (ES 1.4.4, NEST client) to full-text index documents via the REST API.

It works very well for small and medium-sized files, e.g. MS Office documents, PDFs etc. But when I tried to full-text index a PDF slightly larger than 75 MB, an out of memory exception was thrown.

My question is: at the moment we need to read the whole PDF file into an array and then convert it to a base64 string for content indexing. As I understand it, this can cause out of memory exceptions very often when files are large.
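
To make the encoding step concrete, here is a minimal sketch of doing the base64 conversion in chunks instead of on one big array. It is written in Python rather than the C#/NEST code used here, and the function name `base64_encode_stream` and the chunk size are made up for illustration. It only covers the encoding side; whether the encoded content can then be sent to ES without being buffered as a whole is exactly the open question.

```python
import base64
import io

# Hypothetical chunk size; any positive value works because leftover
# bytes are carried over to the next iteration.
CHUNK_SIZE = 3 * 1024 * 1024


def base64_encode_stream(src, dst, chunk_size=CHUNK_SIZE):
    """Copy binary stream `src` to `dst` as base64, one chunk at a time.

    Only about one chunk is held in memory, not the whole file.
    Returns the number of base64 bytes written.
    """
    pending = b""
    written = 0
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        pending += block
        # Encode only a multiple of 3 bytes so that no '=' padding
        # appears in the middle of the output.
        usable = len(pending) - (len(pending) % 3)
        encoded = base64.b64encode(pending[:usable])
        dst.write(encoded)
        written += len(encoded)
        pending = pending[usable:]
    if pending:
        # The final 1 or 2 bytes are the only place padding is needed.
        encoded = base64.b64encode(pending)
        dst.write(encoded)
        written += len(encoded)
    return written


if __name__ == "__main__":
    source = io.BytesIO(b"example pdf bytes")
    target = io.BytesIO()
    base64_encode_stream(source, target, chunk_size=4)
    print(target.getvalue() == base64.b64encode(b"example pdf bytes"))
```

Note that base64 output is 4/3 the size of the input, so a 75 MB file becomes about 100 MB of text before it is even placed in the request body.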

Is there a way to stream a file to ES for full-text indexing? If that is not possible, are there any other ways to full-text index large files?

Thank you very much for your help.

Best Regards
Hao
