Anecdotal Advice on memory budget / OOM recovery mechanism for approx 2 billion log lines?

This is my setup:

- Running ES 0.90.2
- 5 ES nodes, each with 24GB of RAM, 16GB of it given to ES
- soft field cache (settings sketched below)
- 7 indices, one for each day of the week, totaling around 2 billion log lines
- some amount of faceting / sorting, because I am using Kibana as the front end
- log lines are generally small (under a few KB); occasionally there are larger entries (maybe 100KB)
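
For concreteness, the soft field cache and the heap allocation above amount to roughly the following on each node (I'm quoting the 0.90-era settings from memory, so double-check the exact keys against the docs):

    # elasticsearch.yml -- field cache entries held via soft references,
    # which the GC is allowed to reclaim under memory pressure
    index.cache.field.type: soft

    # environment (e.g. /etc/default/elasticsearch or the init script)
    ES_HEAP_SIZE=16g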

I am getting Java heap out-of-memory exceptions as I approach 2 billion lines.

I have three questions:

  1. It's my understanding that setting the field cache to 'soft' prevents the
    field cache itself from causing out-of-memory errors, because the GC always
    has the option to collect the soft references before throwing OOM. Is this
    correct?

  2. Does anyone have anecdotal advice on the memory budgets they use, assuming
    a similar setup?

  3. Obviously avoiding OOM is the ideal, but in case it does happen, what's
    the easiest way to recover automatically? The best I have is polling the
    status API and, if it times out or the cluster goes red, restarting all of
    the ES instances (roughly the sketch after this list).
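
Concretely, the kind of watchdog I have in mind is roughly the sketch below (Python; the hostnames and the ssh/service restart command are placeholders for whatever actually manages the nodes, and I'm hitting /_cluster/health since that's where "red" comes from):

    #!/usr/bin/env python3
    # Watchdog sketch: poll cluster health, restart ES if it times out or goes red.
    # Node list, timeouts, and the restart command are placeholders for my setup.
    import json
    import subprocess
    import time
    import urllib.request

    NODES = ["es1:9200", "es2:9200", "es3:9200", "es4:9200", "es5:9200"]  # placeholder hostnames
    POLL_SECONDS = 60
    HTTP_TIMEOUT = 30

    def cluster_status(node):
        # Returns "green"/"yellow"/"red", or None if the request fails or times out.
        try:
            with urllib.request.urlopen("http://%s/_cluster/health" % node,
                                        timeout=HTTP_TIMEOUT) as resp:
                return json.loads(resp.read().decode("utf-8")).get("status")
        except Exception:
            return None

    def restart_all():
        # Placeholder restart: ssh to each box and bounce the service;
        # swap in whatever mechanism actually manages the nodes.
        for node in NODES:
            host = node.split(":")[0]
            subprocess.call(["ssh", host, "sudo", "service", "elasticsearch", "restart"])

    if __name__ == "__main__":
        while True:
            if cluster_status(NODES[0]) in (None, "red"):
                restart_all()
            time.sleep(POLL_SECONDS)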

Thanks in advance,
~joe!
