SystemMemoryOutOfException While Indexing Documents as attachments

ASN · May 27, 2016, 6:58am

Hi All,

I'm facing a problem while trying to index documents as attachments (using NEST and Visual Studio).
I can index 20-25 documents without any problem.
But with more documents, say 50 or so, indexing fails.
I'm using IndexMany to index the documents.

My Understanding:
As far as I understand, the first 20 or so documents total about 5 MB, whereas the remaining 30 or so total about 80 MB, and one of those files is 50 MB. I think it's throwing the error when it tries to index this big document.

But previously, when I worked on something similar, I was able to index this document too without any problem.

More description about the issue

KodrAus · May 27, 2016, 7:59am

That's a pretty big document, but shouldn't really be an issue...

I can think of a couple of things you can try:

Can you run the application outside of Visual Studio and see what it does?
Can you run the application without indexing that single large document and see what it does?
Can you look at your memory usage in Task Manager while running in Visual Studio and see how much memory it actually uses?

If you're using Visual Studio 2015 you can also crack out the diagnostic tools window and see what it's up to.

EDIT: Looking at the code you put in the other thread it looks like the issue specifically is while serialising the content encoding... Is there any particular reason you're using Base64 and not UTF8 for the file data? But I'm not totally sure how much difference that makes...
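For a rough sense of scale, here is a minimal sketch of what Base64 does to the payload size. The class and method names are made up for illustration. The 4-to-3 ratio is a property of Base64 itself, and the doubling comes from .NET strings storing each char as two bytes (UTF-16).

```csharp
using System;

public static class Base64Overhead
{
    // Base64 emits 4 output characters for every 3 input bytes,
    // rounding the last group up with padding.
    public static long EncodedLength(long rawBytes)
    {
        return 4 * ((rawBytes + 2) / 3);
    }

    // Approximate size of the character data once the Base64 text
    // is held in a .NET string (2 bytes per char).
    public static long InMemoryStringBytes(long rawBytes)
    {
        return 2 * EncodedLength(rawBytes);
    }
}
```

Calling Base64Overhead.InMemoryStringBytes(50L * 1024 * 1024) for the 50 MB file gives a figure of roughly 133 MB for the string alone, on top of the original byte array and any copy the serializer makes while building the request.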

ASN · May 27, 2016, 8:23am

Thanks for the reply Ashley.
Here are the observations you have asked for.

I'm assuming you're asking me to run the application's .exe. I did that with the large document included. It didn't throw any error (as you can see, I wrapped the index call in a try-catch block, and it never reached the catch block). I got the success message after indexing. So when I ran the .exe, no error was raised and the success message was printed, but no files were indexed when I checked in the Head plugin.

without the document and using the .exe of the application
I could index all the documents without any problem, and this is reflected in the Head plugin.

without that doc
I started with ~958 MB of available memory. When I ran the application, it dropped to a low of ~330 MB. When I closed the application, it took some time to close and then released the memory. After things settled, the available memory was ~800 MB.

with that doc
I started with ~1.3 GB of available memory. When I ran the application, it dropped to a low of ~340 MB and threw an error. When I closed the application, it closed and released the memory. After things settled, the available memory was ~1.2 GB. CPU utilization also hit 100% a few times.

Please let me know if any other info is required.
TIA

KodrAus · May 27, 2016, 8:57am

That's weird... are you doing anything with the response object you get from IndexMany? Is it returning any errors to say the request didn't succeed?

And I have to say... Nearly 1Gb of RAM needed to index 30Mb of data is a pretty sad ratio (not your fault, that's just .NET for you)

ASN · May 27, 2016, 9:03am

No. I just created the variable, that's it.
Something like var response = client.IndexMany(list,"indexname").
It is throwing an error here every time.
Sometimes it works perfectly, sometimes it won't. Confused. I don't know if it is a problem in my code, in Elasticsearch, or, as you said, in .NET.

Adding to this problem, I have another problem.

So this week, I'm completely stuck with these issues.

KodrAus · May 27, 2016, 9:20am

Nest doesn't always throw an exception when there's a problem with the request. So in the case where you get a success message but nothing is indexed I would check that response to make sure it actually succeeded.

There are some properties on there (I can't remember the names off the top of my head) that contain the actual response from Elasticsearch. You can write that to the console to see what you're getting back when not running from Visual Studio.
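As a hedged sketch of that check, assuming a NEST 2.x client (the version is a guess based on the CallDetails property mentioned later in the thread): the helper name IndexAndReport is hypothetical, and the properties used (IsValid, DebugInformation, ItemsWithErrors, and the per-item Id, Status and Error) are the ones I believe the bulk response exposes in that version, so check them against the NEST version you have installed.

```csharp
using System;
using System.Collections.Generic;
using Nest;

public static class BulkResponseCheck
{
    // Hypothetical helper: indexes the documents and reports what came back.
    // T is whatever document type is being indexed.
    public static bool IndexAndReport<T>(IElasticClient client, IEnumerable<T> docs, string index)
        where T : class
    {
        var response = client.IndexMany(docs, index);

        // IsValid is false when the call failed or any bulk item failed,
        // even though no exception was thrown.
        if (response.IsValid)
        {
            Console.WriteLine("Bulk request succeeded.");
            return true;
        }

        // Summary of the request, the response and any captured exception.
        Console.WriteLine(response.DebugInformation);

        // Per-document failures reported by Elasticsearch.
        foreach (var item in response.ItemsWithErrors)
        {
            Console.WriteLine("id " + item.Id + " status " + item.Status + ": " + item.Error);
        }
        return false;
    }
}
```

Printing the success message only when this returns true avoids the case where the program reports success but nothing shows up in the index.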

How much memory does your app use when you run the exe without Visual Studio debugging? Should be much less...

ASN · May 27, 2016, 9:23am

Response has some properties like

I printed out calldetails and original exception details.
Below is the screenshot of that.

KodrAus · May 27, 2016, 9:31am

Does that error message happen reliably every time?

Maybe the socket is being disposed because of low memory...

ASN · May 27, 2016, 9:35am

With the Items in the response.
This time it didn't print the success message. I don't know why.

KodrAus · May 27, 2016, 9:38am

I'm thinking that buffering all your files into memory and throwing them at Elasticsearch all at once isn't going so well.

Before doing anything else, do you want to adjust your program a bit to only read 1 file at a time? You could do this by grabbing the file metadata like you do, but instead of reading the contents of all files into the list, read 1 file, upload to Elasticsearch, dispose of the filestream, read the next file and so on.
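A minimal sketch of that loop, with assumptions stated up front: OneFileAtATime and IndexSequentially are made-up names, and indexOne is a hypothetical callback standing in for the real upload (for example a single-document index call on the NEST client). File.ReadAllBytes opens and closes the file itself, so there is no stream left to dispose here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class OneFileAtATime
{
    // indexOne receives (file name, Base64 content) and is expected to
    // upload exactly one document. It is a placeholder, not a NEST API.
    public static int IndexSequentially(IEnumerable<string> paths, Action<string, string> indexOne)
    {
        int count = 0;
        foreach (var path in paths)
        {
            // Only this file's bytes and Base64 text are referenced during
            // the iteration; earlier files are eligible for collection.
            byte[] bytes = File.ReadAllBytes(path);
            string content = Convert.ToBase64String(bytes);

            indexOne(Path.GetFileName(path), content);
            count++;
        }
        return count;
    }
}
```

The trade-off is one HTTP request per file instead of one bulk request, in exchange for a peak memory footprint bounded by the largest single file rather than the sum of all of them.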

Sorry I'd go and quickly tweak your sample code but I'm on my phone so can't edit code well

KodrAus · May 27, 2016, 9:58am

Yeh if it's not private then feel free to make a github gist or something with the sample and send a link.

I should get some time over the weekend to take a look

ASN · May 30, 2016, 2:30am

Hi Ashley,
While I'm indexing documents, I tried to print the response.

var response = client.IndexMany(list, "watcherfilesreader");
foreach (var item in response.Items)
{
    Console.WriteLine(item.ToString());
}

It says 1 error, but displays nothing. Is it correctly indexed or indexed with error?

TIA

ASN · June 8, 2016, 1:27am

I'm still having the OutOfMemoryException issues... I'm not sure if it is happening on the VS side or the Elasticsearch side.

KodrAus · June 8, 2016, 6:08am

Hmmm, that would be on the VS side, it looks like memory is a bit of a sticking point on your setup, especially with that large document.

Both .NET and Elasticsearch are fairly free-spirited with memory and will take as much as they can get. Memory debugging can be a bit challenging, especially when the issue is that you're running out of it, but Visual Studio's memory diagnostic tools might be able to help you out.

Otherwise something more heavy-weight like ANTS Memory Profiler (has a free trial) could be worth looking into.
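Before reaching for a profiler, a cheap first step is to log memory figures around the indexing call. This is only a sketch: MemoryLog and Snapshot are made-up names, while GC.GetTotalMemory and Process.WorkingSet64 are standard .NET calls.

```csharp
using System;
using System.Diagnostics;

public static class MemoryLog
{
    // Returns a one-line summary of managed heap size and process working set.
    public static string Snapshot(string label)
    {
        // Bytes currently thought to be allocated on the managed heap;
        // false means do not force a collection first.
        long managed = GC.GetTotalMemory(false);

        long workingSet;
        using (var process = Process.GetCurrentProcess())
        {
            // Physical memory currently mapped to this process.
            workingSet = process.WorkingSet64;
        }

        const long MB = 1024 * 1024;
        return label + ": managed=" + (managed / MB) + " MB, working set=" + (workingSet / MB) + " MB";
    }
}
```

Writing Console.WriteLine(MemoryLog.Snapshot("before IndexMany")) and the same after the call shows whether the growth happens while reading the files or while building and sending the request.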
