You're right, it sounds crazy. But Lucene relies heavily on the filesystem cache: if you don't leave enough free RAM for the filesystem cache, Lucene won't be able to use it.
So a common recommendation is to give only half of the RAM to the heap.
To be clear, applying this won't reduce memory pressure inside the JVM.
My guess is that since your index cannot be fully loaded into the heap (500 GB vs 13 GB), Lucene reads data from the filesystem very often, so the filesystem cache should play a big role in response time.
That said, if you have SSD drives, or if response time is not an issue for you, maybe keeping the heap at 13 GB is fine.
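For example, assuming you start the node with the stock bin/elasticsearch script (which honours the ES_HEAP_SIZE environment variable), capping the heap at half the RAM would look something like:

    # give the JVM 8 GB and leave the other ~8 GB to the OS filesystem cache
    export ES_HEAP_SIZE=8g
    bin/elasticsearch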
Just my 2 cents here.
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 22 November 2013 at 09:45:35, ivan babrou (ibobrik@gmail.com) wrote:
Why should I decrease the heap size if I/O saturation is low? And how can I
find out memory usage per Lucene instance? I will decrease sharding,
but I'm just curious and haven't seen any mention of this in the docs.
On 22 November 2013 12:33, David Pilato david@pilato.fr wrote:
Whoa! 13 GB on a 16 GB machine is too much; 8 GB is better.
90 * 5 = 450 shards = 450 Lucene instances on a single machine, on a JVM with
only 13 or 8 GB of RAM!
I think I would first decrease the number of shards to 1 per index.
I would also consider adding more nodes and/or more memory if possible.
Auto-closing of indices does not exist; since opening an index comes with a cost,
this is probably something you would want to control yourself.
I would probably manage that at the client level.
You could also consider adding less expensive nodes to hold your old data, since
you probably have fewer requests on it, and use shard allocation filtering
to move your old indices to those nodes.
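As a rough sketch of those two suggestions (the index names, the node attribute "box_type" and the value "cold" are just examples, not built-in settings):

    # create the daily index with a single shard instead of five
    curl -XPUT 'localhost:9200/logs-2013.11.22' -d '{
      "settings": { "number_of_shards": 1, "number_of_replicas": 1 }
    }'

    # tag the cheaper nodes in their elasticsearch.yml, e.g.
    #   node.box_type: cold
    # then tell an old index to relocate onto them via allocation filtering
    curl -XPUT 'localhost:9200/logs-2013.08.01/_settings' -d '{
      "index.routing.allocation.include.box_type": "cold"
    }'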
HTH
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 22 November 2013 at 09:07:41, Ivan Babrou (ibobrik@gmail.com) wrote:
We have 2 machines with 16 GB of RAM, 13 GB of which is given to the JVM. We have 90
indices (5 shards, 1 replica), one per day of data, with 500 GB of data on each
node. 20% of memory is for the field cache and 5% is for the filter cache.
The problem is that we have to shrink the cache sizes again because of increased
memory usage over time. A cluster restart doesn't help. I guess that indices
require some memory, but apparently there is no way to find out how much
memory each shard is using that cannot be freed by GC. Now we have 10-20% of
CPU time wasted on GC, and this is not what we'd like to see.
Is there a way to reduce, or at least find out, memory usage for
indices/shards? Ideally it would be cool if Elasticsearch could "park" old
indices that are not used often, a kind of automatic open/close for
indices.
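For reference, the cache limits described above correspond roughly to these lines in elasticsearch.yml (assuming the 0.90-era setting names; check the docs for your exact version), and the node stats API gives at least a coarse view of where the heap is going:

    # elasticsearch.yml
    indices.fielddata.cache.size: 20%
    indices.cache.filter.size: 5%

    # per-node JVM and indices memory statistics
    curl 'localhost:9200/_nodes/stats?pretty'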
--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou