How does ES balance memory usage?

I have 8 nodes in my cluster, but I noticed that the memory usage is
pretty uneven once all of my documents are indexed. That by itself
doesn't bother me, but I am getting worried because I noticed that as
I've been populating a new index, the node that is almost full for
some reason seems to want to attract a lot more documents to it! :)
It's actually starting to garbage collect more often than the other
nodes (not too often, just twice a day).

My question is - when/if that node runs really low on memory will ES
start filling the other nodes? Is there some kind of threshold (95%
full) where ES will stop attempting to drop more documents on this
node?

Several of my other nodes are less than 25% full and could easily
handle more.

Heya,

Elasticsearch balances a cluster by shards: it strives to keep an even number of shards on each node. It does not look at memory when placing them. Memory consumption is also kind of difficult to judge in a VM-based language with a garbage collector, since everything may be fine and one node has simply had more memory collected than the others.
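
To make the shard-count idea concrete, here is a minimal Python sketch. It is not Elasticsearch's actual allocator, only an illustration of "give the next shard to the node holding the fewest shards". The node names, shard names, and document counts are made up for the example.

```python
from collections import Counter

def assign_shards(shard_names, nodes):
    """Place each shard on the node that currently holds the fewest shards.

    Illustrative only: the real allocator is more involved. Note that
    nothing here looks at document counts or memory.
    """
    counts = {node: 0 for node in nodes}
    placement = {}
    for shard in shard_names:
        # min() returns the first node among those tied for fewest shards.
        target = min(nodes, key=lambda n: counts[n])
        placement[shard] = target
        counts[target] += 1
    return placement, counts

# Hypothetical shards as (name, document_count) pairs.
shards = [("idx-0", 9_000_000), ("idx-1", 100),
          ("idx-2", 8_000_000), ("idx-3", 200)]
nodes = ["node-a", "node-b"]

placement, counts = assign_shards([name for name, _ in shards], nodes)

# Add up how many documents each node ends up holding.
docs_per_node = Counter()
for name, docs in shards:
    docs_per_node[placement[name]] += docs

print(counts)               # {'node-a': 2, 'node-b': 2}
print(dict(docs_per_node))  # {'node-a': 17000000, 'node-b': 300}
```

Both nodes end up with two shards each, yet one holds almost all of the documents. That is the kind of imbalance a shard-count scheme cannot see.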

Other balancing schemes (for example, trying to strive to keep an even number of docs across the cluster, or balanced storage) can be implemented in future versions.

-shay.banon
(In reply to jalano's message of Friday, February 18, 2011 at 3:11 AM, the question at the top of this thread.)

Other balancing schemes (for example, trying to strive to keep an
even number of docs across the cluster, or balanced storage) can be
implemented in future versions.

Or just a simple weighting that can be set per server?

clint
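
For illustration, a per-server weighting like the one suggested here could look roughly like the Python sketch below. This is purely hypothetical: the `weights` mapping and the `assign_weighted` function are invented for the example and are not an Elasticsearch setting or API. Each shard goes to the node whose shard count would stay lowest relative to its weight.

```python
def assign_weighted(shard_names, weights):
    """Spread shards across nodes in proportion to a per-node weight.

    Hypothetical scheme: `weights` maps node name -> relative capacity.
    """
    counts = {node: 0 for node in weights}
    for _ in shard_names:
        # Pick the node with the lowest load-per-weight after the placement.
        target = min(weights, key=lambda n: (counts[n] + 1) / weights[n])
        counts[target] += 1
    return counts

# A node weighted 3.0 should end up with three times the shards of a 1.0 node.
counts = assign_weighted(range(8), {"big": 3.0, "small": 1.0})
print(counts)  # {'big': 6, 'small': 2}
```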