We have 10 data nodes in our ES cluster. Each has the following configuration:
heap size: 8 GB out of 16 GB RAM
default indexing buffer size (10% of heap)
indices.store.throttle.type: merge
indices.store.throttle.max_bytes_per_sec: 50mb
index.cache.field.max_size: 1000mb
index.cache.field.expire: 15m
index.store.type: mmapfs
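For completeness, a sketch of how this maps onto elasticsearch.yml, with the implicit indexing-buffer default written out (the actual setting name is indices.memory.index_buffer_size; the 8 GB heap is assumed here to be set via the ES_HEAP_SIZE environment variable read by the startup script):

    # elasticsearch.yml on each data node
    indices.memory.index_buffer_size: 10%   # the default, shown explicitly
    indices.store.throttle.type: merge
    indices.store.throttle.max_bytes_per_sec: 50mb
    index.cache.field.max_size: 1000mb
    index.cache.field.expire: 15m
    index.store.type: mmapfs
    # and in the shell environment before starting the node: ES_HEAP_SIZE=8g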
So far we are only indexing; no search queries are executed on the ES
data nodes. Our data nodes' heap usage crosses 5 GB, so we took a heap
dump and analysed it.
In the dump we saw that unreachable objects took nearly 3 GB (kindly see
the attachment). The remaining 2 GB is occupied by RobinEngine instances,
which is acceptable given the BloomFilter used for primary-key lookup.
My questions are:
Though we set mmapfs as the store type, what actually occupies that 3 GB?
Among the 3 GB, byte[] took 750 MB; what is stored in those byte[] arrays?
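(As a side note for anyone hitting the same issue: heap and cache footprints can be watched without a full heap dump via the nodes stats API. A sketch, assuming a node reachable on localhost:9200; on older releases the path is _cluster/nodes/stats rather than _nodes/stats:

    curl -s 'http://localhost:9200/_nodes/stats?pretty'

The jvm section reports heap usage per memory pool, and the indices section reports the field and filter cache sizes, which can be set against the 3 GB of unreachable objects.)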
Hi Jörg, thanks for your reply. Each data node has approximately 350-400
shards. We are creating daily indexes, so indexing happens only on a
limited number of shards on each node. We retain indexes for only a
limited number of days, say 15-20. Each time we saw a few GBs of
unreachable objects in the heap dumps of almost all data nodes. As I
mentioned in the following post, our nodes are leaving the cluster due to
long GC pauses.
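To put numbers on those pauses, one option (a sketch, assuming a standard HotSpot JVM and that extra flags can be passed through the environment picked up by the startup script, e.g. ES_JAVA_OPTS) is to enable GC logging:

    ES_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/elasticsearch/gc.log"

Elasticsearch's own monitor.jvm logger also warns about long collections in the node log, which is a cheaper first check.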
If you use index.cache.field.expire, you are activating expiration control.
I don't know whether this is very useful; it may create many additional
unreachable objects, but I'm not sure.
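A minimal sketch of that suggestion, assuming these settings live in elasticsearch.yml: drop the expiry and let the size cap alone drive eviction, which avoids the expiration sweep producing short-lived garbage:

    index.cache.field.max_size: 1000mb
    # index.cache.field.expire removed; eviction now happens only on the size limit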