indices.cache.filter.size limit not enforced?

Hello,

Yesterday I had the same problem as last time (
https://groups.google.com/d/msg/elasticsearch/PQhJezsaQrg/w8_KmhJ0wcQJ).

But this time I suspect that the indices.cache.filter.size limit might not be enforced.

What happened is that the filter cache grew well beyond its limit, up to 80% of the
total JVM heap instead of the configured 30%.

At one point a GC pause longer than the 3x30s ping timeout made the node leave the cluster:

[2013-10-22 07:16:40,459][INFO ][discovery.zen ] [sissor1]
master_left
[[sissor2][sBQ1oCTbRsGexVcQpu466Q][inet[/192.168.110.90:9300]]], reason
[failed to ping, tried [3] times, each with maximum [30s] timeout]
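
For context, that means a single stop-the-world pause of more than 3 x 30s = 90 seconds.
The per-collector GC time can also be pulled from the nodes stats (same redacted host;
I'm assuming the jvm flag behaves here as it does elsewhere in the 0.90 series):

$ curl -XGET 'xxxxxx/_nodes/stats?jvm=true&pretty=true'
# look at jvm.gc.collectors.*.collection_time for each node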

The settings:

$ curl -XGET xxxxxx/_cluster/settings?pretty=true
{
"persistent" : { },
"transient" : {
"indices.cache.filter.size" : "30%"
}
}
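
The limit itself was applied as a transient cluster-settings update, something like:

$ curl -XPUT xxxxxx/_cluster/settings -d '{
  "transient" : {
    "indices.cache.filter.size" : "30%"
  }
}'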

This is a two-node cluster with:

  • Elasticsearch version 0.90.3
  • 256 Go RAM
  • 127.8 Go heap
  • index total size: 956gb
  • 150 000 000 docs
  • 14 indices, 28 shards + 1 replica

I know this is an unusual heap size, but until now I have not had many problems.

Could it be a bug, or did I miss something?

Benoît


Nobody has an opinion?


I've had no issues with changing the settings.

Is 256 Go/127.8 Go a typo for GB? You might be better off running two
instances on a single machine, since giving the JVM more than a 32GB heap is
detrimental.
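
Roughly something like this, with two nodes sharing the box but using their own data
paths and ports (the names, paths and ports are only illustrative, and I'm assuming the
usual -Des.* startup flags of the 0.90 series):

$ bin/elasticsearch -Des.node.name=sissor1-a \
    -Des.path.data=/data/es-a -Des.http.port=9200 -Des.transport.tcp.port=9300
$ bin/elasticsearch -Des.node.name=sissor1-b \
    -Des.path.data=/data/es-b -Des.http.port=9201 -Des.transport.tcp.port=9301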

Ivan


Hello,

Thank you for looking at this, comments below.

On Thursday, October 24, 2013 8:57:21 PM UTC+2, Ivan Brusic wrote:

I've had no issues with changing the settings.

Is 256 Go/127.8 Go a typo for GB?

Oh yes, sorry: "o" is for "octet", which is French for byte ;-)

You might be better off running two instances on a single machine,

I've never read anything official about running two instances of ES on one box being supported.

since giving the JVM more than a 32GB heap is detrimental.

I know that, but when we started this project there was little room in the
rack, so we chose two large machines rather than many small ones.

The question remains: is it a bug, or is it normal, that the filter cache can grow
over the indices.cache.filter.size limit?

Benoît


Hi

I've never seen the filter cache limit not being enforced. If you can
provide supporting data, i.e. the filter cache size from the nodes stats plus
the settings you had in place at the time, that would be helpful.

I support Ivan's comment about heap size: the bigger the heap, the longer
GC takes. And using a heap above 32GB means the JVM can't use compressed
pointers. So better to run multiple nodes on one machine, using "shard
awareness" to ensure that you don't have copies of the same data on the
same machine.
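
The awareness part is just a node attribute plus one cluster setting, roughly like
this (the attribute name "machine" is only an example; I'm assuming each node is
started with -Des.node.machine=<hostname> or the equivalent line in elasticsearch.yml):

$ curl -XPUT 'xxxxxx/_cluster/settings' -d '{
  "persistent" : {
    "cluster.routing.allocation.awareness.attributes" : "machine"
  }
}'

With two copies of each shard and two distinct "machine" values, the allocator will
then keep a primary and its replica on different physical boxes.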

Clint


Hi!

On Friday, October 25, 2013 2:06:58 PM UTC+2, Clinton Gormley wrote:

I've never seen the filter cache limit not being enforced. If you can
provide supporting data, i.e. the filter cache size from the nodes stats plus
the settings you had in place at the time, that would be helpful.

The output of _cluster/settings and _nodes/stats?all=true is in the following
gist: "nodes stats and cluster setting" (GitHub).

The value is not that high right now, but 44.5gb is over 30% of the committed
heap (127.8gb):

"filter_cache": {
"memory_size": "44.5gb",
"memory_size_in_bytes": 47819287444,
"evictions": 0
},
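
Quick arithmetic to show the gap:

$ echo '127.8 * 0.30' | bc    # 38.34 gb -> the cap implied by the 30% setting
$ echo '44.5 - 38.34' | bc    # 6.16 gb  -> how far the observed cache is over that cap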

I support Ivan's comment about heap size: the bigger the heap, the longer
GC takes. And using a heap above 32GB means the JVM can't use compressed
pointers. So better to run multiple nodes on one machine, using "shard
awareness" to ensure that you don't have copies of the same data on the
same machine.

OK, I will think about it, but the machines are in production...

Regards

Benoît


Hello,

Have there been any updates on this? We are using nodes with 256GB of RAM
and heap sizes of 96GB, and we are seeing this exact same issue where the filter
cache grows above the limit. What I also discovered was that when I
set the filter cache size to 31.9GB or lower the limit worked fine, but
anything above that and it did not.
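
In other words, an explicit absolute value just under that threshold keeps being
enforced, e.g. (host redacted; pick whatever value makes sense for your heap):

$ curl -XPUT 'xxxxxx/_cluster/settings' -d '{
  "transient" : {
    "indices.cache.filter.size" : "31gb"
  }
}'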

Thanks,
Daniel


For those who come to this thread through a search engine: Dan found the root
cause of this issue (see his observation above about limits over ~32GB not being enforced).


--
Adrien Grand
