How do I check what is causing high JVM usage?

qub123 · February 12, 2021, 11:32am

Hi,

We have been seeing heavy load on our system, but I am not sure exactly what is causing it. Is there a way to check what has caused the high JVM load? I know high JVM load can be caused by a high number of shards, the number of requests, queries, etc., but without tangible figures it is hard to understand what exactly is causing it.

We have 5 data nodes, each with 64GB RAM and a 12-core CPU.
We have 114 indices and 988 shards spread across the 5 data nodes.

We are thinking of adding 2 more data nodes and then reallocating some of the indices to those nodes.
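For a rough sense of scale, the figures above can be turned into shards per node. This is only a sketch: it assumes shards are balanced evenly across data nodes, which the post does not confirm, and `shards_per_node` is a hypothetical helper name.

```python
# Rough shards-per-node arithmetic using the figures from the post.
# Assumption: shards are spread evenly across the data nodes.

def shards_per_node(total_shards, data_nodes):
    """Average number of shards each data node would hold."""
    if data_nodes <= 0:
        raise ValueError("data_nodes must be positive")
    return total_shards / data_nodes


if __name__ == "__main__":
    print(f"5 nodes: {shards_per_node(988, 5):.1f} shards per node")
    print(f"7 nodes: {shards_per_node(988, 7):.1f} shards per node")
```

Note that adding nodes lowers the per-node count but leaves the total of 988 shards unchanged.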

alejandrosl · February 12, 2021, 11:46am

Hi there

Having JVM memory pressure above 75% is not a problem in itself, and it is often caused by having too many shards (each shard [Lucene index] allocates resources such as memory).

You can always add more nodes, but maybe the first question is why you need this many shards. Do you perhaps run many queries and need the parallelism?

Another main source of this heavy memory use is expensive queries.
You can also check how the garbage collector is behaving.

Let me share a link to an Elastic blog entry about understanding memory pressure: https://www.elastic.co/es/blog/found-understanding-memory-pressure-indicator/
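To get tangible figures per node, one option is to look at the output of `GET _nodes/stats/jvm`. The sketch below assumes you have already fetched that response and parsed it into a dict; `summarize_jvm` is a hypothetical helper, and it treats the fill level of the old generation pool as the memory pressure figure, which is how I understand the blog entry describes it.

```python
# Summarize heap and old-generation GC figures per node.
# Assumption: node_stats is the parsed JSON body of GET _nodes/stats/jvm.

def summarize_jvm(node_stats):
    """Return one row per node, highest old-pool usage first."""
    rows = []
    for node_id, node in node_stats.get("nodes", {}).items():
        jvm = node["jvm"]
        old_pool = jvm["mem"]["pools"]["old"]
        old_gc = jvm["gc"]["collectors"]["old"]
        max_bytes = old_pool["max_in_bytes"]
        # Guard against a zero or missing maximum to avoid dividing by zero.
        pressure = 100.0 * old_pool["used_in_bytes"] / max_bytes if max_bytes else 0.0
        rows.append({
            "node": node.get("name", node_id),
            "heap_used_percent": jvm["mem"]["heap_used_percent"],
            "old_pool_percent": round(pressure, 1),
            "old_gc_count": old_gc["collection_count"],
            "old_gc_millis": old_gc["collection_time_in_millis"],
        })
    return sorted(rows, key=lambda r: r["old_pool_percent"], reverse=True)
```

Sampling this twice a few minutes apart and comparing `old_gc_count` and `old_gc_millis` gives an idea of how hard the garbage collector is working on each node.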

And... most people know that the JVM memory limit for Elasticsearch is 32GB (as you have), but in my personal experience I don't configure more than 30GB because the garbage collector is pretty playful (that is just my own opinion).

regards
Ale

warkolm · February 14, 2021, 11:09pm

I would agree with Alejandro here. Your average shard size is pretty small, I would look to increase that to reduce some of your heap use.
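If you want to verify the average shard size yourself, a sketch like the following works on the output of `GET _cat/shards?format=json&bytes=b`. It assumes the rows have already been parsed into a list of dicts; `average_primary_shard_gb` is a hypothetical helper name.

```python
# Average primary shard size from _cat/shards output.
# Assumption: cat_shards is the parsed JSON list from
# GET _cat/shards?format=json&bytes=b (sizes reported in bytes, as strings).

BYTES_PER_GB = 1024 ** 3


def average_primary_shard_gb(cat_shards):
    """Mean store size in GB across assigned primary shards."""
    sizes = [
        int(row["store"])
        for row in cat_shards
        # Unassigned shards report no store size, so skip them.
        if row.get("prirep") == "p" and row.get("store") is not None
    ]
    if not sizes:
        return 0.0
    return sum(sizes) / len(sizes) / BYTES_PER_GB
```

Only primaries are counted so that replicas do not weight the average twice.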

qub123 · February 15, 2021, 8:31am

Hi,

Thanks for all the replies!
I noticed that the number of shards per index is a little high, so I am wondering how many shards would be best for our indices. I have taken a snapshot of our top 25 largest indices: the biggest has 53m documents (170.4GB), the second largest has 38.7m, and the third largest has 13.8m. The top 20 indices each have more than 100k documents. We are aiming to keep breaking the big indices down into smaller ones, but I am unsure what a reasonable size for each index is. Please see my questions below.

What is a reasonable size for each index to get good performance?
How many shards are good for each index? For example, some of the indices have over 10 million documents, some over 1 million, and some over 100k.
How do I check where the replica shard is located? If I set each index to have only one shard and the node that holds the primary shard goes down, will the replica be available?
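On the last question, the `node` and `prirep` columns of `GET _cat/shards?format=json` show where each copy lives. Elasticsearch does not place a replica on the same node as its primary, so an assigned replica should survive the loss of the primary's node. A minimal sketch, assuming the rows are already parsed into a list of dicts (both function names are hypothetical):

```python
# Locate primary and replica copies of each shard from _cat/shards output.
# Assumption: cat_shards is the parsed JSON list from GET _cat/shards?format=json.
from collections import defaultdict


def copies_by_shard(cat_shards):
    """Map (index, shard number) to the nodes holding its copies."""
    placement = defaultdict(lambda: {"primary": None, "replicas": []})
    for row in cat_shards:
        key = (row["index"], int(row["shard"]))
        if row["prirep"] == "p":
            placement[key]["primary"] = row.get("node")
        else:
            # Unassigned replicas have no node and are stored as None.
            placement[key]["replicas"].append(row.get("node"))
    return dict(placement)


def shards_without_live_replica(cat_shards):
    """Shards that have no replica assigned to any node."""
    return sorted(
        key for key, copy in copies_by_shard(cat_shards).items()
        if not any(copy["replicas"])
    )
```

Any shard returned by `shards_without_live_replica` would become unavailable if the node holding its primary went down.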

/Kenneth

warkolm · February 15, 2021, 10:35pm

20-50GB is a good shard size.
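As a worked example, that target can be turned into a primary shard count for a given index size. This is just arithmetic on the 20-50GB guideline, not an official formula, and `primary_shard_range` is a hypothetical helper name.

```python
# Primary shard counts that keep each shard within a target size range.
# Assumption: index data is split evenly across its primary shards.
import math


def primary_shard_range(index_gb, min_shard_gb=20.0, max_shard_gb=50.0):
    """Return (fewest, most) primary shards that keep shards in the range."""
    fewest = max(1, math.ceil(index_gb / max_shard_gb))
    # Small indices still need at least one shard, even if it is under the minimum.
    most = max(fewest, math.floor(index_gb / min_shard_gb))
    return fewest, most


if __name__ == "__main__":
    # The largest index mentioned in the thread is 170.4GB.
    print(primary_shard_range(170.4))
```

For the 170.4GB index this gives between 4 and 8 primary shards, while an index of a few GB fits in a single shard.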

system · March 15, 2021, 10:36pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
