OutOfMemoryError from field cache

I'm trying to debug OutOfMemoryErrors that keep happening in our
Elasticsearch cluster. I've been monitoring statistics once a minute
leading up to the error, and the problem seems to come from the field
data cache. I've seen other discussions about the field data cache
being needed for facet and sort searches, but this cluster is only
being used for indexing right now. While monitoring the statistics I
see the query count go up every time the problem happens, so I was
wondering whether Elasticsearch queries itself in order to warm the
field data cache. Does it do this, or does it wait until a real query
comes in?
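For reference, the per-node numbers below can be collected with a loop like the following (a sketch assuming a node on localhost:9200; on the 0.x series the endpoint was /_cluster/nodes/stats, newer versions use /_nodes/stats):

```shell
# Poll node statistics once a minute; the "search" and "cache" sections
# of the response are what's quoted below.
while true; do
  date
  curl -s 'http://localhost:9200/_cluster/nodes/stats?pretty=true'
  sleep 60
done
```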

Statistics at one minute.
"search" : {
  "query_total" : 0,
  "query_time" : "0s",
  "query_time_in_millis" : 0,
  "query_current" : 0,
  "fetch_total" : 0,
  "fetch_time" : "0s",
  "fetch_time_in_millis" : 0,
  "fetch_current" : 0
},
"cache" : {
  "field_evictions" : 0,
  "field_size" : "0b",
  "field_size_in_bytes" : 0,
  "filter_count" : 0,
  "filter_evictions" : 0,
  "filter_size" : "0b",
  "filter_size_in_bytes" : 0
},

The next minute. We're configured with 20 gigs for each node, so this
quickly causes the OutOfMemoryError.
"search" : {
  "query_total" : 1,
  "query_time" : "16.5s",
  "query_time_in_millis" : 16543,
  "query_current" : 0,
  "fetch_total" : 0,
  "fetch_time" : "0s",
  "fetch_time_in_millis" : 0,
  "fetch_current" : 0
},
"cache" : {
  "field_evictions" : 0,
  "field_size" : "13.4gb",
  "field_size_in_bytes" : 14475278126,
  "filter_count" : 3,
  "filter_evictions" : 0,
  "filter_size" : "26.7mb",
  "filter_size_in_bytes" : 28089832
},

I have the same problem.

Today I have a million documents and 4 million document tags. When I
run a facet across all documents, I get an OutOfMemory error. I
started the JVM with 4GB.

2012/1/8 Jason jason@element84.com:

[quoted message trimmed]

--
Gustavo Maia

I added "index.cache.field.type: soft" to our elasticsearch config and
this has fixed the problem. Search performance isn't any worse. We
don't use facet searches over the majority of our documents, and we
only sort on two fields. My theory is that the resident cache is
putting more than the two sort fields in memory, or that all of the
values in the two sort fields take more memory than we have to
accommodate them. The fields we're sorting on are an id field and a
date field; the id field has only a dozen unique values across 100
million records. I want to try to calculate how much memory the
fields must be taking. I could probably determine this by starting
with an empty index, adding 10 documents, and then executing a query
with sorting that matches all of them. Then I could check the
statistics to see how much memory is allocated to the field cache.
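As a rough back-of-envelope check of that theory (a sketch, assuming a simplified model where field data keeps one in-memory entry per document per sort field: a 4-byte ordinal into the unique-value table for strings, and a raw 8-byte long for dates; the real layout may differ):

```python
def estimate_fielddata_bytes(num_docs, unique_values, avg_value_bytes):
    """Estimate memory for one string sort field: per-document ordinals
    plus the table of unique values."""
    ordinals = num_docs * 4                    # one 4-byte ordinal per doc
    values = unique_values * avg_value_bytes   # the distinct values themselves
    return ordinals + values

# id field: 100 million documents, ~12 unique values of ~20 bytes each
id_field = estimate_fielddata_bytes(100_000_000, 12, 20)

# date field: one raw 8-byte long per document
date_field = 100_000_000 * 8

total = id_field + date_field
print(total / 1024**3)  # about 1.1 GiB -- far less than the 13.4gb observed
```

Under these assumptions the two sort fields alone should take on the order of a gigabyte, nowhere near the 13.4gb in the stats, which supports the idea that more than the two sort fields is being loaded.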

On Jan 8, 10:58 am, Gustavo Maia gustavobbm...@gmail.com wrote:

[quoted message trimmed]