Number of vCPUs vs Number of Shards

I have an on-premises server running Elasticsearch 6.8 that was set up years ago. At the time it had only one vCPU; now it has 4 vCPUs.

When I created the index mappings, I used the settings below. Can this cause problems?

curl -XPUT 'localhost:9200/index_name?pretty' -H 'Content-Type: application/json' -d '
{
    "settings": {
        "number_of_shards": 1
    },
    "mappings": {
        ...
    }
}'

I ask because I have seen a sudden increase in the average server load (reaching 10), but CPU usage never goes past 50% and RAM usage stays moderate (68% of 8 GB).

I noticed that the load average started to increase at 11:35, and my server reported the entries below in the log, but I don't understand what these log entries mean.

[2022-04-29T09:34:02,740][INFO ][o.e.m.j.JvmGcMonitorService] [main-node] [gc][252358] overhead, spent [332ms] collecting in the last [1s]

[2022-04-29T09:47:44,032][WARN ][o.e.m.j.JvmGcMonitorService] [main-node] [gc][253175] overhead, spent [1.5s] collecting in the last [1.9s]

[2022-04-29T10:05:12,456][INFO ][o.e.m.j.JvmGcMonitorService] [main-node] [gc][254221] overhead, spent [337ms] collecting in the last [1s]

[2022-04-29T11:35:39,346][INFO ][o.e.m.j.JvmGcMonitorService] [main-node] [gc][young][259633][21204] duration [805ms], collections [1]/[2.7s], total [805ms]/[10m], memory [1.6gb]->[1.4gb]/[3.9gb], all_pools {[young] [231.7mb]->[11.9mb]/[266.2mb]}{[survivor] [13.2mb]->[33.2mb]/[33.2mb]}{[old] [1.3gb]->[1.4gb]/[3.6gb]}

[2022-04-29T11:35:41,140][INFO ][o.e.m.j.JvmGcMonitorService] [main-node] [gc][259633] overhead, spent [805ms] collecting in the last [2.7s]

/etc/elasticsearch/jvm.options

-Xms4g
-Xmx4g

Should I increase these values? What percentage of RAM should be reserved for the heap?
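For context on the sizing question: Elastic's general guidance (from their docs, not from this thread) is to give the heap roughly 50% of system RAM, capped below ~31 GB so compressed object pointers stay enabled. A minimal sketch of that rule of thumb, using this server's 8 GB of RAM:

```shell
# Rule of thumb: heap ≈ 50% of RAM, capped at ~31 GB (compressed oops).
total_ram_gb=8                                 # this server's RAM
heap_gb=$(( total_ram_gb / 2 ))                # 50% of RAM
if [ "$heap_gb" -gt 31 ]; then heap_gb=31; fi  # never exceed ~31 GB
echo "-Xms${heap_gb}g"                         # prints -Xms4g
echo "-Xmx${heap_gb}g"                         # prints -Xmx4g
```

That works out to 4 GB here, which matches the existing -Xms4g/-Xmx4g, so the heap is already at the suggested 50% of RAM.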

Elasticsearch 6.8 is EOL and no longer supported. Please upgrade ASAP.

(This is an automated response from your friendly Elastic bot. Please report this post if you have any suggestions or concerns :elasticheart: )

While the number of CPUs can impact performance, I don't think there is a direct relation between the number of CPUs and the number of shards.

The recommendations Elastic gives about the number of shards are based on the memory used by the Elasticsearch process and on the size of the shards.

One recommendation is to keep shard sizes in the tens of GB, somewhere around 40 to 50 GB per shard. Another is to have at most 20 shards per GB of heap memory on each node.

Since you have 4 GB of heap memory, you should have no more than 80 shards in that node.
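As a quick sanity check, the shards-per-heap rule is simple arithmetic (the numbers below are the ones from this thread: a 4 GB heap and 20 shards per GB):

```shell
# Rule of thumb: at most ~20 shards per GB of heap on a node.
heap_gb=4
shards_per_gb=20
max_shards=$(( heap_gb * shards_per_gb ))
echo "max recommended shards: ${max_shards}"   # prints 80 for a 4 GB heap
```

The actual shard count on a node can be checked with the `_cat/shards` API.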

Is it a single-node cluster, or do you have other nodes? How many indices and shards do you have on that node? How much memory does the server have? What is the disk type?

Do you run anything else on that machine or just Elasticsearch?

With 4 vCPUs and a load of 10, your server is overloaded. Since you do not see any CPU or memory increase, the cause could be I/O requests.

If you have too many shards or are using a slow disk, that could be the issue: your node/cluster may be oversharded.
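One way to confirm the I/O theory (a common diagnostic, not something mentioned in the thread): on Linux, the fifth CPU counter in /proc/stat is iowait ticks, the same figure `top` shows as %wa and `iostat` as %iowait. A minimal sketch, assuming a Linux host:

```shell
# Read the aggregate CPU counters from the first line of /proc/stat (Linux).
# Field order: cpu user nice system idle iowait irq softirq ...
read -r cpu user nice system idle iowait _ < /proc/stat
echo "iowait ticks since boot: ${iowait}"
```

If load stays high while user/system CPU stays low and iowait keeps climbing, the box is disk-bound rather than CPU-bound.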

Hi @leandrojmp,

Do you run anything else on that machine or just Elasticsearch?

I only run Elasticsearch on it.

Is it a single-node cluster, or do you have other nodes? How many indices and shards do you have on that node? How much memory does the server have? What is the disk type?

It is a single-node cluster, with 6 indices and 6 shards (I can send the full cluster status if necessary).

Server specifications are:

  • SSD
  • 4vCPU
  • 8GB of RAM

Indices status:

"indices" : {
    "count" : 6,
    "shards" : {
      "total" : 6,
      "primaries" : 6,
      "replication" : 0.0,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "primaries" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "replication" : {
          "min" : 0.0,
          "max" : 0.0,
          "avg" : 0.0
        }
      }
    }
}

If you have too many shards or are using a slow disk, that could be the issue: your node/cluster may be oversharded.

My server is hosted on DigitalOcean, where the vCPUs are shared. I don't know if that could be the problem, or if it's low RAM. The issue usually occurs once a day, or at least a couple of times a week.