Hi,
On the ECE template configuration page, we can create a dedicated master instance type.
In the sizing section, we can set a memory-to-disk ratio.
If we set a high ratio like 32, ECE will allocate 32GB of disk per 1GB of memory.
So if a user chooses 4GB for a dedicated master, ECE will allocate 128GB of disk for that instance.
As we know, master nodes mainly use memory and don't require much disk. 20GB should be enough for one master node regardless of its memory size, so 128GB is excessive and mostly wasted.
So my question is: how can we limit a master instance type to a fixed disk size such as 20GB?
You are correct, master nodes require almost no storage, which is why the default ratio in the master node instance configuration we provide out of the box is 1:4. That works well for smaller master nodes. If you know you have large clusters and need master nodes with 8-16GB of RAM, you can configure an additional instance configuration for master nodes with a 1:2 ratio, which would use 16-32GB of storage. You can then use these in the relevant templates.
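For reference, such an instance configuration can also be created through the API with a payload roughly like the sketch below. This is a minimal sketch assuming the ECE instance configuration endpoint (/api/v1/platform/configuration/instances); the name is a placeholder and the sizes shown are only illustrative:

```
{
  "name": "master-low-storage",
  "description": "Dedicated master nodes with a 1:2 RAM-to-storage ratio",
  "instance_type": "elasticsearch",
  "node_types": ["master"],
  "storage_multiplier": 2,
  "discrete_sizes": {
    "sizes": [1024, 2048, 4096, 8192, 16384],
    "default_size": 1024,
    "resource": "memory"
  }
}
```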
There is currently no way to set a fixed disk size in the instance configuration. There is a valid need for it, not only for master nodes but also for other node types that rely on RAM and/or CPU and don't need disk to grow linearly with RAM. I'll make sure to open an internal enhancement request.
Thanks for opening this issue, and I hope my input helps.
Does ECE check the available disk size when it tries to allocate an ES node?
For example, say an ECE allocator has 32GB of memory but only 256GB of disk.
If I allocate a default 64GB ES instance, that instance should consume 64 * 16GB = 1024GB of disk.
If ECE checked the available disk size, this allocation should fail immediately.
But if I remember correctly, I have created this kind of instance successfully before.
If ECE doesn't check disk size, then it doesn't matter how much disk the master node nominally occupies.
I have ECE 2.1, and when I override fs_multiplier in the advanced config, it doesn't verify the available disk size. I was able to create a node with 150TB on a server with only 3TB available.
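For context, the override I edited in the advanced cluster configuration looked roughly like the snippet below. I'm reproducing the nesting from memory, so the exact path in the plan JSON may differ between ECE versions, and the value 50 is only an example:

```
{
  "overrides": {
    "resources": {
      "fs_multiplier": 50
    }
  }
}
```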
You are correct. We currently only consider RAM when allocating a node to a host and do not validate that enough storage is available to satisfy the amount of storage a node could potentially use when it's full.
We have an internal issue to discuss adding that as another resource to consider when deploying a node.
Today, users need to ensure that hosts have enough disk space. This has been made easier with allocator tags and allocator filters in instance configurations, introduced in ECE 2.0, which can help match an instance type to a specific set of allocators and make bin-packing easier, though still not in the same way we calculate and consider RAM as a required resource.
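As a rough illustration, the allocator filter in an instance configuration is a query over the tags you assign to your allocators, so a master instance configuration could be pinned to a tagged group of hosts with a fragment like the one below (the tag name small_disk and its value are placeholders you would define yourself):

```
{
  "allocator_filter": {
    "bool": {
      "must": [
        { "term": { "small_disk": "true" } }
      ]
    }
  }
}
```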
I don't think this should always be a blocker, though.
There shouldn't be an issue with users over-provisioning on storage, because clusters won't always approach their full size quickly, and storage can always be added to the file system as the allocated deployments grow. You just need to monitor and plan ahead.
Fully agree. This could be a flag, at the platform or specific allocator level, that indicates the desired behaviour and determines whether allocation should block or just warn (default).