How can I modify the distribution of shards across nodes with different disk capacities?

xyz_hh · July 5, 2019, 3:31am

I have several servers with different disk capacities in my Elasticsearch cluster. I find that the shards are allocated evenly across all the servers. But what I want is: more disk, more shards. A node with less disk should hold fewer shards. How can I modify the distribution of the shards?
Thanks!

warkolm · July 5, 2019, 7:47am

I don't believe you can do this.

DavidTurner · July 5, 2019, 8:41am

Sorta. Disk-based shard allocation prevents nodes from running out of disk space, which is normally what you want. This is enabled by default. Is that insufficient? Why?

xyz_hh · July 5, 2019, 8:50am

I know the nodes will not run out of disk space. But when the disk space on one node is exhausted, can that node still get new shards to balance the write load? Or does it just store the old data for reading?

martinr_ubi · July 5, 2019, 8:53am

Maybe I'm reading the question differently, but I think you can, based on what you said, which is a bit vague:

https://www.elastic.co/guide/en/elasticsearch/reference/current/disk-allocator.html

With this you can control when a node will start to refuse new shards and/or move them away toward other nodes with more free space. So you can have fewer shards on nodes with less disk space if you simply let them fill up to the high watermark. At that point they will hover around this high watermark. When data is deleted and they regain free space they will receive shards, and when they get back up to the high watermark they will refuse new shards and/or push some away. You can adjust that value, but it applies across all nodes.
So if you continue to fill the cluster up, the smaller nodes will plateau when they reach the high watermark and the bigger nodes will continue to fill up, giving you an unequal (not averaged) shard distribution, which is what you want.
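
To make the plateau effect concrete, here is a toy simulation in Python. It is not Elasticsearch's actual allocator: it assumes equal-sized shards, a balancer that simply prefers the node with the fewest shards, and a single watermark above which a node refuses new shards. The function name and all the numbers are made up for illustration.

```python
def fill_cluster(capacities_gb, shard_gb, n_shards, watermark_pct=90):
    """Toy model of filling a cluster, NOT the real Elasticsearch allocator.

    Assumptions: all shards have the same size, the balancer prefers the
    node holding the fewest shards, and a node refuses a shard that would
    push its disk usage above watermark_pct.
    """
    used_gb = [0] * len(capacities_gb)
    counts = [0] * len(capacities_gb)
    for _ in range(n_shards):
        # Nodes that would stay at or below the watermark after taking the shard.
        eligible = [
            i for i, cap in enumerate(capacities_gb)
            if (used_gb[i] + shard_gb) * 100 <= watermark_pct * cap
        ]
        if not eligible:
            break  # every node is full; the shard stays unassigned
        # Prefer the node with the fewest shards; break ties by node index.
        target = min(eligible, key=lambda i: (counts[i], i))
        used_gb[target] += shard_gb
        counts[target] += 1
    return counts


# Two 100 GB nodes and one 400 GB node, 10 GB shards.
print(fill_cluster([100, 100, 400], shard_gb=10, n_shards=12))  # [4, 4, 4]
print(fill_cluster([100, 100, 400], shard_gb=10, n_shards=40))  # [9, 9, 22]
```

While the cluster is mostly empty the shards spread evenly. Once the two small nodes reach the watermark they stop at 9 shards each and the large node absorbs everything else.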

Don't want to wait? You could lower the watermarks to adjust the distribution if you have a constant/stable dataset in the cluster. Lowering them pushes down the space used on the smallest nodes, which will displace those shards toward the bigger nodes.
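
As a rough sketch of what lowering the watermarks could look like, the Python snippet below builds the body for a `PUT _cluster/settings` request using the disk watermark settings from the disk allocator documentation linked above. The helper name and the 70%/80%/90% values are only illustrative assumptions, not a recommendation; the snippet only prints the request and does not contact a cluster.

```python
import json


def watermark_settings(low, high, flood_stage):
    """Build a cluster settings body that changes the disk watermarks.

    Hypothetical helper; the setting keys are the documented disk-based
    shard allocation settings, the values are up to you.
    """
    return {
        "persistent": {
            "cluster.routing.allocation.disk.watermark.low": low,
            "cluster.routing.allocation.disk.watermark.high": high,
            "cluster.routing.allocation.disk.watermark.flood_stage": flood_stage,
        }
    }


# Example values only: lower than the defaults so small nodes plateau sooner.
body = watermark_settings("70%", "80%", "90%")
print("PUT _cluster/settings")
print(json.dumps(body, indent=2))
```

The printed body can then be sent with whatever client you normally use (Kibana Dev Tools, curl, and so on). Keep low below high and high below flood_stage.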

Another approach is partitioning the indices more statically, specializing some nodes. Indices A and B go on the 4 smaller nodes and indices C, D, E go on the bigger nodes. Depending on the index sizes and shard counts you allocate them to nodes and thus control the distribution.
This is shard allocation filtering:
https://www.elastic.co/guide/en/elasticsearch/reference/current/shard-allocation-filtering.html
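
A minimal sketch of that static partitioning, assuming each node has been tagged in `elasticsearch.yml` with a custom attribute such as `node.attr.disk_size: small` or `node.attr.disk_size: big`. The attribute name `disk_size`, the index names, and the helper function are hypothetical; the `index.routing.allocation.require.*` setting is the one described on the linked page. The snippet only prints the requests.

```python
import json


def allocation_filter(attribute, value, mode="require"):
    """Build an index settings body that pins an index to tagged nodes.

    Hypothetical helper; mode is one of the documented filter types.
    """
    if mode not in ("require", "include", "exclude"):
        raise ValueError(f"unknown filter mode: {mode}")
    return {f"index.routing.allocation.{mode}.{attribute}": value}


# Made-up plan: which index should live on which class of node.
plan = {
    "index_a": "small",
    "index_b": "small",
    "index_c": "big",
    "index_d": "big",
    "index_e": "big",
}

for index, size in plan.items():
    print(f"PUT {index}/_settings")
    print(json.dumps(allocation_filter("disk_size", size)))
```

With `require`, an index's shards are only allocated to nodes whose attribute matches, so the shard count per node class follows directly from the plan.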

I could think of still other ways to achieve it, but they become more and more operationally heavy and need more adjustment, or they flat out require that you build a shard orchestrator on the side and configure the cluster's allocation and balancing parameters so that it does not fight the orchestrator. I think the more complex the approach, the less useful it would be, but I can't judge your reasons because you haven't given any. (Why do you want the nodes to get shards proportionally to their disk size? Let the smallest ones fill up before the other ones; what's wrong with this in your situation? You haven't said.)
The KISS principle applies: is it really needed, why, and are there simpler alternatives?

Also, do these heterogeneous nodes differ only in disk size? Do you know about hot-warm cluster architectures? That could apply here if you turn those node types into hot, warm, and cold node types, which is another knob for controlling the placement of shards across nodes.

system · August 2, 2019, 8:53am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
