Using the Bulk Indexing API, if my node crashes, my Elasticsearch heap memory does not get freed

(Sahil Chelaramani) #1

So I am using the Bulk indexing API for ES 1.6. If an OOM occurs, then on restart the node's heap does not get cleared for some reason, and further OOMs occur much faster. I am not running any queries, only indexing my documents. I am running a two-node cluster with 9GB allocated to each machine, and I have to index about 21M documents.

(Mark Walkom) #2

If it's still OOMing then you likely have too much data in there, but more information will help. How much data is in the cluster?

(Sahil Chelaramani) #3

The cluster has about 50 GB worth of data, and I am indexing about 150 documents at a time. Each document is about 4KB on average.
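For reference, here is a rough sketch of what one such batch looks like on the wire and how large it is, using the numbers above. The index name, type name, and helper names (`build_bulk_body`, `chunks`) are made up for illustration and are not taken from the actual indexing code.

```python
import json

def build_bulk_body(docs, index, doc_type):
    """Serialize docs into the newline-delimited body the _bulk endpoint expects."""
    lines = []
    for doc in docs:
        # Action line: asks ES to index the document on the next line.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"

def chunks(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Back-of-the-envelope sizing from the figures in this thread.
BATCH_SIZE = 150            # documents per bulk request
AVG_DOC_BYTES = 4 * 1024    # about 4KB per document
TOTAL_DOCS = 21_000_000     # about 21M documents overall

approx_request_bytes = BATCH_SIZE * AVG_DOC_BYTES   # 614400 bytes, roughly 600KB
total_requests = TOTAL_DOCS // BATCH_SIZE           # 140000 bulk requests
```

At roughly 600KB per request, the bulk payloads themselves are small compared with a 9GB heap.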

(Mark Walkom) #4

Then that's very odd.

If you start ES and then check _cat/fielddata, what does it report? Also, what's in your logs?
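If it helps, a minimal sketch of pulling that report from a script instead of a browser or curl. It assumes the node listens on localhost:9200 with no authentication; the function names are just placeholders.

```python
from urllib.request import urlopen

def cat_fielddata_url(host="localhost", port=9200):
    """Build the URL for the fielddata report; ?v adds a header row."""
    return "http://{}:{}/_cat/fielddata?v".format(host, port)

def fetch_text(url, timeout=10):
    """Fetch a plain-text _cat response and return it as a string."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `print(fetch_text(cat_fielddata_url()))` right after a restart should show one row per node with its total fielddata size.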

(Sahil Chelaramani) #5

It tells me that there are 0b allocated for the fielddata. Is that supposed to happen? I am using ElasticHQ for monitoring the cluster, and it shows me that about 80% of the heap is occupied on node restart as well.
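To cross-check the ElasticHQ figure, the same percentage can be computed from the JSON returned by `GET /_nodes/stats/jvm`. This is only a sketch: `heap_report` is a made-up helper, and the sample dict below uses invented numbers, not readings from this cluster.

```python
def heap_report(stats):
    """Map each node name to its heap usage as a percentage of max heap."""
    report = {}
    for node_id, node in stats.get("nodes", {}).items():
        mem = node["jvm"]["mem"]
        used = mem["heap_used_in_bytes"]
        limit = mem["heap_max_in_bytes"]
        # Fall back to the node id if the node has no name field.
        report[node.get("name", node_id)] = round(100.0 * used / limit, 1)
    return report

# Invented sample in the shape of a nodes stats response (not real data).
sample = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "jvm": {"mem": {"heap_used_in_bytes": 8 * 1024 ** 3,
                            "heap_max_in_bytes": 10 * 1024 ** 3}},
        }
    }
}
```

Passing the parsed response body to `heap_report` right after a restart, and again a few minutes later, would show whether the heap is really full or just has not been garbage collected yet.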

(Mark Walkom) #6

Are you using parent/child or nesting?
