How does Elasticsearch handle nodes that do not all have the same amount of
disk space? Looking at the example below, does ES limit storage to that of
the smallest node, or does it make use of all the space and just allocate
the shards accordingly?
Or is it even smart enough to take disk space into consideration at all?
The disk threshold stuff has been available since sometime in 0.90, and in
1.3 it'll be on by default. It works by stopping allocation of shards to
nodes whose disk usage is over a (low) watermark, and by moving shards off
nodes that are over another, higher watermark. Meaning Elasticsearch won't
try to balance disk usage; it just keeps disks from filling up. It's a
somewhat fine distinction, but a real one.
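If you want to turn this on (or tune it) before 1.3, the watermarks are
dynamic cluster settings. A rough sketch; the percentages below are just
example values, so check the docs for your version before relying on them:

# Low watermark: stop allocating new shards to a node past this disk usage.
# High watermark: start relocating shards off a node past this disk usage.
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.disk.threshold_enabled": true,
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}'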
What's the ideal way to diagnose the initial problem stated?
I can see in our cluster we have a node with 43GB (of 500) free, yet most
others have around 100GB free (of 500). I can see the shard count per node
(using _cat/shards) seems to be roughly the same, but could it be that some
of our shards are just different sizes?
In the case of the OP though, a 500GB difference seems difficult to explain.
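For checking that, the cat APIs are probably the quickest route; a rough
sketch (the exact columns vary a bit between versions):

# Shard count plus disk used/available per node, in one view
curl 'localhost:9200/_cat/allocation?v'

# Store size per shard, to see whether a few large shards explain the skew
curl 'localhost:9200/_cat/shards?v'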
I'm not sure what problem you mean. If you mean that you have uneven disk
utilization, then I don't think there is anything for that. You could raise
the index weight in the allocation balance settings; that'd spread the
shards of each index out more evenly. It might help.
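If anyone wants to experiment with that, the balance factors are dynamic
cluster settings too. A sketch; 0.7 is only an illustrative value (I believe
the default index weight is around 0.55, but check the docs for your
version):

# Weight the balancer more toward spreading each index's shards evenly across nodes
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.balance.index": 0.7
  }
}'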
OP here. My numbers on the disk space were not an actual observation of
current sizes; it was more a hypothetical about what I can expect ES to do
if I only had three servers and that was the starting disk space available
on each.