Can we limit the disk size of an ES master instance to a fixed value, regardless of its memory size?

Hi,
On the ECE template configuration page, we can create a dedicated master instance type.

In the sizing section, we can set a memory-to-disk ratio.

If we set a high ratio like 32, ECE will allocate 32GB of disk per 1GB of memory.
So if a user chooses 4GB for a dedicated master, ECE will allocate 128GB of disk for that instance.
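
The arithmetic above can be sketched in a few lines of Python. The function name `disk_for_memory` is made up for illustration and is not part of ECE; the sketch only assumes that disk is memory multiplied by the ratio, as described.

```python
def disk_for_memory(memory_gb: int, ratio: int) -> int:
    """Disk (GB) implied by a memory-to-disk ratio of 1:ratio.

    Hypothetical helper for illustration only; not an ECE API.
    """
    return memory_gb * ratio


# A 4GB dedicated master with a 1:32 ratio
print(disk_for_memory(4, 32))  # 128
```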

As we know, a master node mainly uses memory and doesn't need much disk. 20GB should be enough for one master node regardless of its memory size; 128GB is excessive and largely wasted.

So my question is: how can we limit a master instance type to a fixed disk size such as 20GB?

Any tips?

thanks!

Hi @rockybean,

You are correct, master nodes need almost no storage, which is why the default master node instance configuration we provide out of the box uses a 1:4 memory-to-disk ratio. That works well for smaller master nodes. If you know you have large clusters and need master nodes with 8-16GB of RAM, you can create an additional master node instance configuration with a 1:2 ratio, which gives 16-32GB of storage. You can then use these in the relevant templates.

There is currently no way to set a fixed disk size in the instance configuration. There is a valid need for it, not only for master nodes but also for other node types that rely on RAM and/or CPU and do not need disk to grow linearly with RAM. I'll make sure to open an internal enhancement request.

Thanks for opening this issue; I hope my input helps.

Thanks!

I have a follow-up question.

Does ECE check the available disk size when it tries to allocate an ES node?

For example, one ECE allocator has 32GB of memory but only 256GB of disk.
If I allocate a default 64GB ES instance, it should consume 64 * 16GB = 1024GB of disk.

If ECE checked the available disk size, this should fail immediately.

But I have created this kind of instance successfully before, if I remember correctly.

If ECE doesn't check disk size, then the amount of disk allocated to a master node doesn't matter.

Am I right?

I'm using the latest version, and when I try to over-provision I see the expected error: "There is not enough capacity..."

I created an instance configuration with a memory-to-disk ratio of 1:200.

Then I tried to create a deployment with an 8GB node.

This 8GB node would need about 1.6TB of disk (8 * 200GB = 1600GB). In fact, I don't have a disk that big:

[root@a1-172-31-18-161 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      100G   13G   88G  13% /
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G   66M   16G   1% /run
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/xvdb      1000G   55G  945G   6% /mnt

However, the deployment proceeds normally.

Can you test this in your environment?

I have ECE 2.1, and when I override fs_multiplier in the advanced config, it doesn't verify the available disk size. I was able to create a node with 150 TB on a server with only 3 TB available :slight_smile:

"overrides": {
  "quota": {
    "fs_multiplier": 50000
  }
}

Sorry, you are correct. The error displayed was due to missing RAM capacity, not disk space.

So there is no check for disk size, right?

How can we avoid this kind of problem in a production environment?

You are correct. We currently consider only RAM when allocating a node to a host, and do not validate that enough storage is available to cover what the node could use when it's full.

We have an internal issue to discuss adding that as another resource to consider when deploying a node.

Today users need to ensure that the hosts have enough disk space. This has been made easier by allocator tags and filters in instance configurations, introduced in ECE 2.0, which can help match an instance type to a specific set of allocators and make bin packing easier. But storage is still not calculated and considered as a required resource the way RAM is.
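
Until such a check exists, the manual verification described above could look roughly like this. It is only a sketch: `PlannedNode` and `disk_shortfall_gb` are hypothetical names, not ECE types or APIs, and it assumes each node may eventually use its memory multiplied by its ratio.

```python
from dataclasses import dataclass


@dataclass
class PlannedNode:
    """A node planned for an allocator (hypothetical structure, not an ECE type)."""
    memory_gb: int
    disk_ratio: int  # the N in a 1:N memory-to-disk ratio

    @property
    def max_disk_gb(self) -> int:
        # Storage the node could use when it is full
        return self.memory_gb * self.disk_ratio


def disk_shortfall_gb(nodes: list[PlannedNode], allocator_disk_gb: int) -> int:
    """GB by which the planned nodes exceed the allocator's disk (0 if they fit)."""
    needed = sum(node.max_disk_gb for node in nodes)
    return max(0, needed - allocator_disk_gb)


# Numbers from this thread: an 8GB node at 1:200 on a host with a 1000GB data volume
print(disk_shortfall_gb([PlannedNode(8, 200)], 1000))  # 600
```

A non-zero result means the host is over-provisioned on disk even though the RAM-based allocation succeeds.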

OK
Thanks!

I don't think this should always be a blocker, though.

There shouldn't be an issue with users over-provisioning storage, because clusters won't always approach their full size quickly, and storage can always be added to the file system as the allocated deployments grow. You just need to be able to monitor and plan ahead.
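
As a rough illustration of the monitoring side, something like the following could run periodically on each allocator. This is a minimal sketch using only the Python standard library; the function names and the 0.8 threshold are arbitrary example choices, not anything ECE provides.

```python
import shutil


def used_fraction(path: str) -> float:
    """Fraction of the file system at `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


def should_alert(fraction_used: float, threshold: float = 0.8) -> bool:
    """True once usage reaches the threshold (0.8 is an arbitrary example value)."""
    return fraction_used >= threshold


if __name__ == "__main__":
    path = "/"  # replace with the allocator's data mount, e.g. /mnt in the df output earlier in this thread
    frac = used_fraction(path)
    print(f"{path}: {frac:.0%} used, alert={should_alert(frac)}")
```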

Fully agree. This could be a flag, at the platform or specific allocator level, that determines whether the behaviour should be to block or to warn (default).

As far as I know, it's not always easy to expand storage without restarting the ECE node.

Am I right?

Or can you give me any tips on how to do this in a production environment?

I think this is correct. The outline plan for host maintenance (which this would fall under) is documented here: https://www.elastic.co/guide/en/cloud-enterprise/current/ece-perform-host-maintenance.html

Thanks!

Does the ECE team plan to add a disk check in the future?

Restarting an ECE node is an expensive operation, and we do not want to do that in a production environment.

Rather than restarting an ECE node, we would prefer that disk over-provisioning can't happen in the first place.

:joy:

Yes, we do have an internal issue to add this check as well. There is currently no ETA we can share though.
