GC failing to reduce heap memory usage

Hi all,

I have a situation where ES fails to reduce its heap usage. When this
happens, the logs say something like http://pb.abhijeetr.com/OaaB

Indexing hangs, and CPU usage, which is generally around 300%, gets stuck
around 100% with nothing happening.

To give you an idea of the setup, it's a 2-node cluster (2GHz, 8 threads,
48GB RAM) with 3TB of data, each node containing 3.3 billion documents. Each
doc has around 10 fields.

This issue happens only when my data grows beyond a certain limit (around
1.5TB). Is this too much data to be handled by these two nodes? When I clear
the caches for the indices, everything seems to start working again, so the
cache appears to be the only cause. The real question is: why can't GC clear
that cache when ES really needs the memory for other things? I also have
"index.cache.field.type: soft" set in my elasticsearch.yml. What can I do to
fix the problem?

--
Regards,
Abhijeet Rastogi (shadyabhi)

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi,

I'm guessing your heap is too small? 3.3B docs on 2 machines with 48 GB RAM
is a lot of docs. Are you doing any searching or only indexing? Have a
look at the various ES metrics in SPM, including the size of that cache you
mention over time and your GC pattern. That may shed some light.

Otis

ELASTICSEARCH Performance Monitoring - http://sematext.com/spm/index.html

It's actually 6.6 billion docs in total. Right now, I keep clearing caches
just to keep the cluster running. Any idea how much RAM I can have per box
without causing unnecessarily long GC pauses or any other issues?

How can I get info about the different kinds of caches in ES?

--
Regards,
Abhijeet Rastogi (shadyabhi)

I could get those stats via the stats API, but what I'm missing is the
cache size per index. Is that even possible?

--
Regards,
Abhijeet Rastogi (shadyabhi)
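
On the per-index question above: one possible approach is to request
index-level stats and read the cache figures for each index separately. The
sketch below, in Python, covers only the parsing step. The response shape
(indices -> index name -> total -> fielddata / filter_cache ->
memory_size_in_bytes) is an assumption and may differ in your ES version, so
check it against what your cluster actually returns. The index names and
byte counts are made up for illustration.

```python
def cache_bytes_per_index(stats):
    """Return {index_name: (field_cache_bytes, filter_cache_bytes)}.

    Assumes a stats document shaped like
    {"indices": {name: {"total": {"fielddata": {...}, "filter_cache": {...}}}}}.
    Sections that are missing are counted as 0 bytes.
    """
    result = {}
    for name, body in stats.get("indices", {}).items():
        total = body.get("total", {})
        field = total.get("fielddata", {}).get("memory_size_in_bytes", 0)
        filt = total.get("filter_cache", {}).get("memory_size_in_bytes", 0)
        result[name] = (field, filt)
    return result


# Hypothetical sample response, not real output from a cluster.
sample = {
    "indices": {
        "logstash-2013.04.15": {
            "total": {
                "fielddata": {"memory_size_in_bytes": 1500},
                "filter_cache": {"memory_size_in_bytes": 300},
            }
        },
        "logstash-2013.04.16": {
            "total": {"fielddata": {"memory_size_in_bytes": 2000}}
        },
    }
}

per_index = cache_bytes_per_index(sample)
for name in sorted(per_index):
    field_bytes, filter_bytes = per_index[name]
    print(name, field_bytes, filter_bytes)
```

Sorting the result by the field cache figure would show which daily indices
are holding the most heap.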

You don't describe the kind of queries you execute, so it is hard to give
helpful advice. I assume you use facets or query filters. Memory use depends
on the queries, the number of fields, and the cardinality of the values in
the fields, not necessarily on the mere data volume or the number of docs.

As a general note, you can't expect the standard CMS GC to scale well:
it was designed many years ago for heaps under 8 GB. If you want to
ensure GC runs with low latency on large heaps, consider switching to
the more responsive G1 GC. Nevertheless, using "soft" for the field
cache is a rather random strategy. It is simply unpredictable how much of
your heap will be used. And of course you can always improve the
situation by adding more nodes.

Jörg

I'm using Kibana to search logs and occasionally execute some facet queries
for testing. Kibana mostly uses filters on date fields to get logs.

Logstash creates a new index every day. So, I did some testing by
clearing the cache for all indices except the latest 7 and executed a query
that spans those 7 indices. Then I noted the field cache: it was around
11GB, and the total heap used was 22GB. The filter cache was around 3GB.

@Jörg, does ES always rebuild the field/filter cache before
processing any query? If that is the case, I can use the above method to
predict the amount of RAM used by a single index.

--
Regards,
Abhijeet Rastogi (shadyabhi)
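
The per-index estimate described above can be worked through as simple
arithmetic. This is a rough sketch in Python: the 11GB, 3GB and 7-index
figures come from the test in this message, while the 20GB cache budget is a
hypothetical number chosen only for illustration. It also assumes every
daily index needs about the same amount of cache, which may not hold if log
volume or field cardinality varies from day to day.

```python
FIELD_CACHE_GB = 11.0    # field cache measured after querying 7 indices
FILTER_CACHE_GB = 3.0    # filter cache measured in the same test
INDICES_QUERIED = 7


def per_index_cache_gb(field_gb, filter_gb, n_indices):
    """Average cache footprint of one index, assuming uniform usage."""
    return (field_gb + filter_gb) / n_indices


def max_hot_indices(cache_budget_gb, per_index_gb):
    """How many indices fit in the budget at that per-index footprint."""
    return int(cache_budget_gb // per_index_gb)


per_index = per_index_cache_gb(FIELD_CACHE_GB, FILTER_CACHE_GB, INDICES_QUERIED)
print(per_index)                         # 2.0 (GB per daily index)

# Hypothetical budget: 20 GB of heap reserved for caches across the cluster.
print(max_hot_indices(20.0, per_index))  # 10
```

Under these assumptions, a query spanning more than about 10 daily indices
would need more cache than the budget allows.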

The field/filter cache is reused across queries, as much as possible
(otherwise it would not make much sense).

Jörg

Yeah, that seems fair. What I actually wanted to know was: must the
filter/field cache be present before a query is processed?

I mean, can a query be executed in ES without the field/filter cache being
present?

--
Regards,
Abhijeet Rastogi (shadyabhi)

You have control over which filters are cached. ES helps you with
reasonable default settings, which can be modified. From the guide at
http://www.elasticsearch.org/guide/reference/query-dsl/

"Some filters already produce a result that is easily cacheable, and the
difference between caching and not caching them is the act of placing
the result in the cache or not. These filters, which include the term,
terms, prefix, and range filters, are by default cached...

"Other filters, usually already working with the field data loaded into
memory, are not cached by default..."

Queries can be executed without a cache. Obviously, if you don't specify
filters in your query, but instead express them as queries, there is no
filter caching.

Jörg
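
To illustrate the point about controlling filter caching: as I understand
the filter DSL of this era, a filter can carry a "_cache" flag that
overrides the default. Below is a minimal Python sketch that builds such a
request body. The "@timestamp" field name and the dates are assumptions for
a logstash-style index, and the exact placement of "_cache" should be
checked against the query DSL guide for your version.

```python
import json


def filtered_range_query(field, gte, lte, cache=True):
    """Build a filtered query with a range filter and an explicit cache flag."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "range": {
                        field: {"gte": gte, "lte": lte},
                        # Range filters are cached by default (see the quote
                        # above); False asks ES not to cache this one.
                        "_cache": cache,
                    }
                },
            }
        }
    }


# Hypothetical field name and date range.
body = filtered_range_query("@timestamp", "2013-04-10", "2013-04-17", cache=False)
print(json.dumps(body, indent=2))
```

Turning caching off for one-off date ranges trades some repeated work for a
smaller filter cache.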

Hello,

You can also try tweaking the amount of memory you allow Elasticsearch to
use for caches. For field caches you can have something like:
index.fielddata.cache: node
indices.fielddata.cache.size: 10%

And for filter caches:
indices.cache.filter.size: 10%

As far as I know, the field cache settings require 0.90 and imply that
you remove "index.cache.field.type: soft".

Best regards,
Radu

--
http://sematext.com/ -- ElasticSearch -- Solr -- Lucene
