Should I increase my amount of RAM?

I noticed that the RAM usage percentage of my nodes is quite high.

Based on these logs, should I increase the amount of RAM on my servers?

The master is usually master-01, but in this log it shows as master-02, which could mean the other nodes went down at some point.

ip          heap.percent ram.percent cpu heapPercent heapMax heapCurrent version  jdk    diskTotal diskUsedPercent ramMax load_1m load_5m load_15m node.role   master name
XX.XX.XX.XX          37          97   3          37   7.8gb       2.8gb   8.10.1  20.0.2   232.4gb            9.56 15.6gb    0.32    0.32     0.23 cdfhilmrstw *      master-02
XX.XX.XX.XX          89          98  10          89   7.8gb       6.9gb   8.10.1  20.0.2   232.4gb           45.83 15.6gb    0.30    0.30     0.27 cdfhilmrstw -      master-01
XX.XX.XX.XX          84          98   9          84   7.8gb       6.6gb   8.10.1  20.0.2   232.4gb           51.39 15.6gb    0.30    0.34     0.39 cdfhilmrstw -      master-03

Hi,

Given this information, it would be advisable to increase the amount of RAM on your servers, or to investigate further to see if there are any memory leaks or other issues causing the high memory usage. You could also consider adding more nodes to your cluster to distribute the load more evenly.

Regards

How can I check if there is a memory leak?

Hi @louisdeveloper, those numbers do not really indicate much without more context.

As I am sure you know, Elasticsearch runs on the JVM, and garbage collection is executed when necessary, so the heap percent will go up and down over time.

Unless you are seeing issues, you may not need to do anything. Are you seeing any issues: circuit breaker trips, failed queries, etc.?

If you want to understand more, the first thing to do is set up monitoring on the cluster.

How to do that is in the docs... make sure you look at the docs for your version.

Hi @stephenb, with some frequency I see errors in the log mentioning circuit breaker problems.

Like this message today:

mycustom-elk-cluster.log:Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [internal:cluster/nodes/indices/shard/store[n]] would be [8286921500/7.7gb], which is larger than the limit of [7961208422/7.4gb], real usage: [8286921376/7.7gb], new bytes reserved: [124/124b], usages [model_inference=0/0b, eql_sequence=0/0b, fielddata=0/0b, request=0/0b, inflight_requests=124/124b]
mycustom-elk-cluster.log:    at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:414) ~[elasticsearch-8.10.1.jar:?]
mycustom-elk-cluster.log:    at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:109) ~[elasticsearch-8.10.1.jar:?]

I have no idea what very large data could be being sent.
What we index from our database is no more than 100 MB.
Having more than 1 GB of information in a single item would be far too much.

Or would it be data sent between clusters?

Ahh OK, so you do have errors.

First I would look at this

and this to understand where your JVM usage is coming from

And yes, you may need to increase the JVM heap, but you may have other issues...
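One quick way to confirm how often the breakers are actually firing is the node stats API. A sketch, assuming the cluster is reachable on localhost (adjust host/port as needed); the sample fragment below uses illustrative values, not real cluster data:

```shell
# Per-node circuit-breaker stats (run against your cluster):
#   curl -s "http://localhost:9200/_nodes/stats/breaker?pretty"
# Each breaker reports a "tripped" counter: how often it has fired since
# the node started. Extracting it from a sample response fragment
# (illustrative values, not from the real cluster):
sample='"parent" : { "limit_size_in_bytes" : 7961208422, "estimated_size_in_bytes" : 8286921500, "tripped" : 12 }'
tripped=$(printf '%s\n' "$sample" | grep -o '"tripped" : [0-9]*' | awk '{print $3}')
echo "parent breaker tripped ${tripped} times"
```

A steadily rising `tripped` count on the parent breaker lines up with the `CircuitBreakingException` you posted.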

Hi @stephenb, thank you for your help.

I ran the JVM pressure check at a normal time and the numbers seem normal.
The master even has very low JVM pressure.

I ran the JVM pressure view and the node status view at the same time.

RAM usage is always high, but the JVM seems quiet.

I'm starting to think my problem is just insufficient RAM. What do you think?

heap.percent ram.percent cpu heapPercent heapMax heapCurrent version jdk    diskTotal diskUsedPercent ramMax load_1m load_5m load_15m node.role   master name
          83          98  25          83   7.8gb       6.4gb 8.10.1  20.0.2   232.4gb           51.67 15.6gb    1.34    1.44     1.21 cdfhilmrstw -      master-03
          66          98  28          66   7.8gb       5.2gb 8.10.1  20.0.2   232.4gb           47.04 15.6gb    1.52    1.49     1.22 cdfhilmrstw -      master-01
          38          98   9          38   7.8gb         3gb 8.10.1  20.0.2   232.4gb            9.53 15.6gb    0.42    0.21     0.14 cdfhilmrstw *      master-02
{
    "nodes": {
        "xxxxxxxxxxxx": {
            "jvm": {
                "mem": {
                    "pools": {
                        "old": {
                            "used_in_bytes": 4798938440,
                            "max_in_bytes": 8380219392,
                            "peak_used_in_bytes": 8322237672,
                            "peak_max_in_bytes": 8380219392
                        }
                    }
                }
            }
        },
        "xxxxxxxxxxxx": {
            "jvm": {
                "mem": {
                    "pools": {
                        "old": {
                            "used_in_bytes": 1055057928,
                            "max_in_bytes": 8380219392,
                            "peak_used_in_bytes": 1702827008,
                            "peak_max_in_bytes": 8380219392
                        }
                    }
                }
            }
        },
        "xxxxxxxxxxxx": {
            "jvm": {
                "mem": {
                    "pools": {
                        "old": {
                            "used_in_bytes": 5363155040,
                            "max_in_bytes": 8380219392,
                            "peak_used_in_bytes": 8334093944,
                            "peak_max_in_bytes": 8380219392
                        }
                    }
                }
            }
        }
    }
}

Of course, from time to time a spike appears for a few seconds, rising to 89%, but then it drops.

But even on the machine with low JVM usage, the RAM usage percentage is 98%.
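For reference, the pressure figure is essentially old-gen used divided by old-gen max. Computing it from the first node's numbers in the stats above (a sketch using plain awk for the arithmetic):

```shell
# JVM pressure here is roughly old-gen used / old-gen max.
# Values taken from the first node in the _nodes/stats output above:
used=4798938440
max=8380219392
pct=$(awk -v u="$used" -v m="$max" 'BEGIN {printf "%.0f", u / m * 100}')
echo "old pool usage: ${pct}%"   # about 57%
```

The `peak_used_in_bytes` values close to `max_in_bytes` show the nodes have come near the heap ceiling at some point, even if current usage looks quiet.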

Unfortunately I couldn't find out how to count how many shards there are per index.

I ran the cURL below but only got the following result, without shard information.

curl -X GET "http://XXX.XXX.XXX.XXX:9200/_cat/indices/myindex?format=json"

[
    {
        "health": "green",
        "status": "open",
        "index": "myindex",
        "uuid": "XXXXXXXX",
        "pri": "1",
        "rep": "1",
        "docs.count": "264525828",
        "docs.deleted": "37960517",
        "store.size": "195.2gb",
        "pri.store.size": "97.1gb"
    }
]
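Incidentally, the `_cat/indices` output above already implies the count: `pri: 1` and `rep: 1` means one primary plus one replica, two shards in total. For an explicit per-index listing you can use the `_cat/shards` API instead. A sketch, assuming a locally reachable cluster; the sample lines below are illustrative, not from the real cluster:

```shell
# List every shard copy for the index (one line per shard):
#   curl -s "http://localhost:9200/_cat/shards/myindex?v"
# Counting shards per index from that output; sample lines are
# illustrative, not real cluster data:
sample='myindex 0 p STARTED 264525828 97.1gb 10.0.0.1 master-01
myindex 0 r STARTED 264525828 98.1gb 10.0.0.2 master-02'
counts=$(printf '%s\n' "$sample" | awk '{count[$1]++} END {for (i in count) print i, count[i]}')
echo "$counts"
```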


The fact that memory usage is close to 100% is not necessarily a sign that you need more RAM, and it is not a problem in itself. It is perfectly normal for an Elasticsearch node to show close to 100% of RAM in use if it holds a good amount of data, because much of that memory is used by the operating system page cache, which is critical for optimal Elasticsearch performance. If some process requires memory, the operating system releases memory from the page cache, so this does not cause problems or failures.
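One way to see this on the nodes themselves is `free -h`: a high combined `used` plus `buff/cache` figure is normal, and the `available` column is what actually matters for applications. A sketch parsing a sample output line (illustrative values, not from these servers):

```shell
# On Linux, `free -h` separates application memory from reclaimable
# page cache. Sample "Mem:" line (illustrative values):
sample='Mem:           15Gi       7.9Gi       0.3Gi       0.0Ki       7.4Gi       7.2Gi'
# Fields: total, used, free, shared, buff/cache, available.
# "buff/cache" is reclaimable page cache; "available" is what
# applications can still obtain.
available=$(printf '%s\n' "$sample" | awk '{print $7}')
echo "available to applications: ${available}"
```

So a node showing 98% RAM in use can still have roughly half its memory effectively available once the page cache is counted as reclaimable.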

Hi @Christian_Dahlqvist, in my application I have a reporting feature that runs a lot of scroll queries, but I try to limit them to 5,000 records in total.

Could this be something that consumes a lot of RAM? My old machine had 24 GB of RAM, but that was Elasticsearch version 6 on a single node, with no replica nodes.

I checked today's log and there are a lot of circuit breaker errors on the non-master nodes.