Help with OOM and configuration

Hi,

I would like to ask for hints (and help, if possible) on improving the
behaviour of a single ES node I'm responsible for maintaining. I am fairly
new to ES, and still trying to get a grasp of all the concepts.

The setup is an AWS m3.xlarge instance (4 cores, 15GB RAM) with two storage
drives, running ES 0.90.9 on Java Hotspot 1.7.60 with a heap size of 12GB.
ES is basically the only process running on the machine aside from the
system ones. Currently, we have about 550 million documents indexed, and
we're indexing approx. 1000 per second. CPU is at about 40%, and about
900GB of storage is used. When it works, it works well enough.

The problem is that heap usage is always at least 85% and keeps rising.
I'm aware that, compared to the RAM size, the heap is perhaps too large, but
I had no choice, as the previous settings weren't sufficient. Now and then,
usage goes too high, the logs fill with 'java heap space' errors, and the
process is still present but does nothing. A restart helps and reduces
heap usage to about 50%, which slowly climbs back to 85%, and the cycle
repeats.
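
To put rough numbers on the heap-versus-RAM concern, here is a small Python
sketch. It only does arithmetic on the 15GB and 12GB figures mentioned above;
the variable names are mine and purely illustrative.

```python
# Rough arithmetic only; both figures come from the description above.
ram_gb = 15.0   # m3.xlarge memory
heap_gb = 12.0  # configured ES heap size

heap_share = heap_gb / ram_gb       # fraction of RAM given to the JVM heap
left_for_os_gb = ram_gb - heap_gb   # what remains for the OS and filesystem cache

print("heap share of RAM: %.0f%%" % (heap_share * 100))
print("left outside the heap: %.0f GB" % left_for_os_gb)
```

So roughly four fifths of the machine's memory goes to the heap, which is why
I suspect the setting is on the high side.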

Here's an extract of the info:

"mem" : {
  "heap_used" : "10.3gb",
  "heap_used_in_bytes" : 11102091448,
  "heap_used_percent" : 86,
  "heap_committed" : "11.9gb",
  "heap_committed_in_bytes" : 12850036736,
  "heap_max" : "11.9gb",
  "heap_max_in_bytes" : 12850036736,
  "non_heap_used" : "43.6mb",
  "non_heap_used_in_bytes" : 45803280,
  "non_heap_committed" : "66.3mb",
  "non_heap_committed_in_bytes" : 69550080,
  "pools" : {
    "Code Cache" : {
      "used" : "10.3mb",
      "used_in_bytes" : 10896576,
      "max" : "48mb",
      "max_in_bytes" : 50331648,
      "peak_used" : "10.4mb",
      "peak_used_in_bytes" : 10981184,
      "peak_max" : "48mb",
      "peak_max_in_bytes" : 50331648
    },
    "Par Eden Space" : {
      "used" : "46mb",
      "used_in_bytes" : 48263968,
      "max" : "266.2mb",
      "max_in_bytes" : 279183360,
      "peak_used" : "266.2mb",
      "peak_used_in_bytes" : 279183360,
      "peak_max" : "266.2mb",
      "peak_max_in_bytes" : 279183360
    },
    "Par Survivor Space" : {
      "used" : "15.9mb",
      "used_in_bytes" : 16726464,
      "max" : "33.2mb",
      "max_in_bytes" : 34865152,
      "peak_used" : "33.2mb",
      "peak_used_in_bytes" : 34865152,
      "peak_max" : "33.2mb",
      "peak_max_in_bytes" : 34865152
    },
    "CMS Old Gen" : {
      "used" : "10.2gb",
      "used_in_bytes" : 11037101016,
      "max" : "11.6gb",
      "max_in_bytes" : 12535988224,
      "peak_used" : "10.4gb",
      "peak_used_in_bytes" : 11266814432,
      "peak_max" : "11.6gb",
      "peak_max_in_bytes" : 12535988224
    },
    "CMS Perm Gen" : {
      "used" : "33.2mb",
      "used_in_bytes" : 34906704,
      "max" : "82mb",
      "max_in_bytes" : 85983232,
      "peak_used" : "33.2mb",
      "peak_used_in_bytes" : 34906704,
      "peak_max" : "82mb",
      "peak_max_in_bytes" : 85983232
    }
  }
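
As a quick sanity check on those figures, here is a short Python sketch that
works out where the used heap sits. The byte counts are copied from the
extract above; the helper name `percent` is just something I made up for
illustration.

```python
# Byte counts copied from the "mem" extract above.
heap_used = 11102091448
heap_max = 12850036736
old_gen_used = 11037101016
old_gen_max = 12535988224

def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

heap_used_pct = percent(heap_used, heap_max)            # overall heap usage
old_gen_of_used_pct = percent(old_gen_used, heap_used)  # share of used heap in old gen
old_gen_fill_pct = percent(old_gen_used, old_gen_max)   # how full the old gen pool is

print("heap used: %.1f%% of max" % heap_used_pct)
print("old gen share of used heap: %.1f%%" % old_gen_of_used_pct)
print("old gen fill level: %.1f%%" % old_gen_fill_pct)
```

If I read this correctly, practically all of the used heap is in CMS Old Gen,
so whatever is filling the heap seems to be long-lived rather than short-lived
garbage.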

I did some reading and found suggestions to set a field cache expiry
time, so I set it to 1 minute, with no significant change. Other settings
are mostly default.

I'd appreciate any hints and tips. Have we simply hit the limit, with the
stored data occupying the available heap, or is there some configuration
option I missed? If a new, stronger machine is needed because there's no
other solution, that is something I can work with, but if some setting can
make this work, I'd be happier to go with that.

Thanks in advance for your time. I know this is probably a newbie question,
but...
