I am indexing at a decent rate.
20 indices, each with 10,000 fields and 50,000 documents, indexing continuously across 9 threads.
I have a cluster with a dedicated master node and two data nodes. Each node has 16 GB of RAM and an 8 GB heap.
No issues were found when only the indexing scenario described above was running, although heap usage was near its peak (let's say 6.5 to 7.5 GB). The JVM heap crossed its upper limit when additional indexing and search requests were performed; the cluster went to red status and an OOM error was thrown in the logs.
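To see how close each node is to its heap ceiling while this happens, a quick check like the one below can help (a minimal sketch; it assumes access to the cluster's REST API, e.g. from Kibana Dev Tools):

```
# Per-node heap usage: current, max, and percent used
GET _cat/nodes?v&h=name,node.role,heap.current,heap.max,heap.percent,ram.percent
```

As a rough rule of thumb, nodes that sit persistently above ~75% heap are the ones most at risk of the OOM errors described above.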
My doubts are:

1. What contributes to the JVM heap? I have both text and keyword sub-fields for a single field; fielddata stays in memory but is not enabled by default, and I have not changed that behaviour. Stored fields also contribute to the JVM heap. I have attached a Kibana screenshot taken during indexing (partly); a sketch for inspecting the heap breakdown follows this list.
2. What measures can be taken to bring heap usage down, or rather to prevent the heap from reaching its maximum?
3. Even though I restarted my cluster, the JVM heap did not drop after giving it some time. What can be the causes of this?
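To see which of these is actually occupying heap on each node, something like the following can be run (a minimal sketch using the standard node stats and cat endpoints; `fields=*` just asks for a per-field breakdown):

```
# Heap-resident index structures per node: fielddata, segment memory, query cache, request cache
GET _nodes/stats/indices/fielddata,segments,query_cache,request_cache

# Per-field fielddata usage; should stay near zero if fielddata is left disabled on text fields
GET _cat/fielddata?v&fields=*
```

If fielddata is near zero, heap pressure is more likely coming from segment memory, indexing buffers, and in-flight bulk/search requests; with thousands of fields per index, the mappings held in the cluster state on every node can also be a significant consumer.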
Thanks, David, for the response.
Unfortunately, it's not a typo. It is a common case for my users to have 5,000-7,000 fields per index.
I have configured 2 primary shards and 1 replica per index (roughly as in the sketch below).
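A minimal sketch of those settings (the index name is hypothetical; the same values could equally be applied through an index template):

```
PUT my-index-000001
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}
```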
Aim to keep the average shard size between at least a few GB and a few tens of GB. For use-cases with time-based data, it is common to see shards between 20GB and 40GB in size.
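Given that guidance, it is worth checking how big the shards actually are; a quick way (standard cat API, sorted by on-disk size) is:

```
# Shard-level store size per index, largest first
GET _cat/shards?v&h=index,shard,prirep,store&s=store:desc
```

With 2 primaries per index and 50,000 documents each, individual shards may well be far smaller than that recommended range.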