We have 10 data nodes in our ES cluster. Each has the following configuration:
heap size: 8 GB out of 16 GB RAM
default indexing buffer size (10% of heap)
indices.store.throttle.type: merge
indices.store.throttle.max_bytes_per_sec: 50mb
index.cache.field.max_size: 1000mb
index.cache.field.expire: 15m
index.store.type: mmapfs
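For completeness, a sketch of how this maps onto elasticsearch.yml, with the implicit indexing-buffer default written out (the actual setting name is indices.memory.index_buffer_size; the 8 GB heap is assumed here to be set via the ES_HEAP_SIZE environment variable read by the startup script):

    # elasticsearch.yml on each data node
    indices.memory.index_buffer_size: 10%   # the default, shown explicitly
    indices.store.throttle.type: merge
    indices.store.throttle.max_bytes_per_sec: 50mb
    index.cache.field.max_size: 1000mb
    index.cache.field.expire: 15m
    index.store.type: mmapfs
    # and in the shell environment before starting the node: ES_HEAP_SIZE=8g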
So far we are only indexing; no search queries are executed on the ES
data nodes. Our data nodes' heap usage crosses 5 GB, so we took a heap
dump and analysed it.
In the dump we saw that unreachable objects took nearly 3 GB (kindly see
the attachment). The remaining 2 GB is occupied by RobinEngine instances,
which is acceptable given the BloomFilter used for primary-key lookup.
My questions are:
Though we set mmapfs as the store type, what actually occupies that 3 GB?
Among the 3 GB, byte[] took 750 MB; what is stored in those byte[] arrays?
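(As a side note for anyone hitting the same issue: heap and cache footprints can be watched without a full heap dump via the nodes stats API. A sketch, assuming a node reachable on localhost:9200; on older releases the path is _cluster/nodes/stats rather than _nodes/stats:

    curl -s 'http://localhost:9200/_nodes/stats?pretty'

The jvm section reports heap usage per memory pool, and the indices section reports the field and filter cache sizes, which can be set against the 3 GB of unreachable objects.)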
Hi Jörg, thanks for your reply. Each data node has approximately 350-400
shards. We are creating daily indexes, so indexing happens only on a
limited number of shards on each node. We retain indexes for only a
limited number of days, say 15-20. Each time we saw a few GBs of
unreachable objects in the heap dumps of almost all data nodes. As I
mentioned in the following post, our nodes are leaving the cluster due to
long GC pauses.
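To put numbers on those pauses, one option (a sketch, assuming a standard HotSpot JVM and that extra flags can be passed through the environment picked up by the startup script, e.g. ES_JAVA_OPTS) is to enable GC logging:

    ES_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/elasticsearch/gc.log"

Elasticsearch's own monitor.jvm logger also warns about long collections in the node log, which is a cheaper first check.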
If you use index.cache.field.expire, you are activating expiration control.
I don't know whether this is very useful; it may create many additional
unreachable objects, but I'm not sure.
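A minimal sketch of that suggestion, assuming these settings live in elasticsearch.yml: drop the expiry and let the size cap alone drive eviction, which avoids the expiration sweep producing short-lived garbage:

    index.cache.field.max_size: 1000mb
    # index.cache.field.expire removed; eviction now happens only on the size limit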