Hello. We have a cluster with two data nodes and three dedicated master nodes on the AWS Elasticsearch Service.
We currently have about 3 billion documents (1.1 TB of data). The data nodes were previously running 2 CPUs and 15 GB of memory each. We upgraded them to 4 CPUs and 30 GB of memory each after noticing that CPU load was pegged at nearly 98% constantly.
After the upgrade, the JVM memory pressure is MUCH MORE volatile with huge peaks and valleys. See the graph:
Is this expected? Does this indicate that the cluster is still underprovisioned in terms of memory? Here is a graph of CPU usage comparing the old and new cluster performance: