I am using the Bulk indexing API on ES 1.6. If an OOM occurs, then on restart the node's heap does not get cleared for some reason, and further OOMs occur much faster. I am not running any queries, only indexing my documents. I am running a two-node cluster with 9GB allocated to each machine, and I have to index about 21M documents.
If it's still OOMing then you likely have too much data in there, but more information will help. How much data is in the cluster?
The cluster has about 50 GB of data, and I am indexing about 150 documents at a time. Each document is about 4KB on average.
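For a rough sense of scale, here is a back-of-the-envelope sketch of those numbers. The function names are made up for illustration, and the estimate ignores the action/metadata line that the bulk format adds per document, so treat it as approximate.

```python
# Figures taken from the posts above; everything else is illustrative.
DOCS_PER_BULK = 150        # "about 150 documents at a time"
AVG_DOC_BYTES = 4 * 1024   # "about 4KB on average"
TOTAL_DOCS = 21000000      # "about 21M documents"


def bulk_payload_bytes(docs_per_bulk, avg_doc_bytes):
    """Approximate size of one bulk request body (document sources only)."""
    return docs_per_bulk * avg_doc_bytes


def bulk_requests_needed(total_docs, docs_per_bulk):
    """Number of bulk requests needed, rounding up for a final partial batch."""
    return -(-total_docs // docs_per_bulk)  # ceiling division


payload = bulk_payload_bytes(DOCS_PER_BULK, AVG_DOC_BYTES)
requests = bulk_requests_needed(TOTAL_DOCS, DOCS_PER_BULK)
print("bytes per bulk request:", payload)
print("bulk requests in total:", requests)
```

Under these assumptions each bulk request carries roughly 600 KB of document source, which is small next to a multi-gigabyte heap.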
Then that's very odd.
If you start ES and then check _cat/fielddata, what does it report? Also, what's in your logs?
It tells me that 0b is allocated for fielddata. Is that expected? I am using ElasticHQ to monitor the cluster, and it shows that about 80% of the heap is occupied after a node restart as well.
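To cross-check the heap figure without a monitoring plugin, the node stats API can be queried directly. Below is a minimal sketch, assuming a node is reachable at http://localhost:9200 (an assumption, adjust as needed) and that the `_nodes/stats/jvm` response contains `heap_used_in_bytes` and `heap_max_in_bytes` under each node's `jvm.mem` section. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def heap_usage_percent(node_stats):
    """Map node name -> percent of max heap in use.

    Expects the parsed JSON body of a _nodes/stats/jvm response.
    """
    usage = {}
    for node in node_stats["nodes"].values():
        mem = node["jvm"]["mem"]
        used = mem["heap_used_in_bytes"]
        limit = mem["heap_max_in_bytes"]
        usage[node["name"]] = 100.0 * used / limit
    return usage


def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats for all nodes; base_url is an assumed address."""
    with urlopen(base_url + "/_nodes/stats/jvm") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `heap_usage_percent(fetch_node_stats())` right after a restart, before any bulk requests are sent, should show whether the heap is really around 80% full at that point or whether it only fills up once indexing resumes.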
Are you using parent/child or nesting?