Control over shard and replica allocation


(Berkay Mollamustafaoglu-2) #1

Hi,

Would it be feasible to control which nodes an index can be located on, along
the same lines as how a River can be allocated to a subset of the nodes in
the cluster?

We have a number of use cases that could take advantage of such functionality.
For example, we'd like to be able to dedicate some of the nodes to certain
indices that serve a high volume of search operations and ensure that
search performance is not impacted by heavy write operations on other
indices. Another use case is to prevent a large index from replicating
between data centers, etc.

One approach may be tagging the nodes and optionally specifying tags for
shards as well as replicas. Currently the only way to do something like this
seems to be using multiple ES clusters.

I'll open an issue for this if it makes sense.

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype


(Shay Banon) #2

Hi Berkay,

Yeah, this certainly makes sense, and it's something I am thinking about.
It's tricky to design properly. Tagging of nodes is the first step, but then
there are many ways that users would want to control allocation: at the
index level, the shard level, and the shard replication level (don't put a
shard and its replica on the same tag). You can open an issue for it. It is
planned; I just need to find the best (aka simplest) way to expose it. The
implementation is quite simple thanks to the current shard allocation design.

-shay.banon
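
To make the tagging idea above concrete, here is a minimal sketch of how an allocation check could combine an index-level rule with a "don't put a shard and its replica on the same tag" rule. It is not Elasticsearch code or its API: the names `Node`, `ShardCopy`, `can_allocate`, `index_tags` and `spread_tags` are all hypothetical, and splitting tags into "required by an index" and "spread across" is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    tags: frozenset  # hypothetical node tags, e.g. {"search", "rack1"}


@dataclass(frozen=True)
class ShardCopy:
    index: str
    shard: int
    primary: bool


def can_allocate(copy, node, placements, index_tags, spread_tags):
    """Return True if `copy` may be placed on `node`.

    placements:  list of (ShardCopy, Node) pairs already allocated.
    index_tags:  index name -> tags a node must share with (index-level rule).
    spread_tags: tags that two copies of the same shard must not share
                 (replication-level rule).
    """
    required = index_tags.get(copy.index, frozenset())
    # Index level: if the index names tags, the node must carry at least one.
    if required and not (required & node.tags):
        return False
    # Replication level: reject a node that shares a spread tag with a node
    # already holding another copy of the same shard.
    for other_copy, other_node in placements:
        same_shard = (other_copy.index, other_copy.shard) == (copy.index, copy.shard)
        if same_shard and (node.tags & other_node.tags & spread_tags):
            return False
    return True


# Example: a "products" index pinned to nodes tagged "search",
# with copies of a shard spread across racks.
nodes = [
    Node("n1", frozenset({"search", "rack1"})),
    Node("n2", frozenset({"search", "rack2"})),
    Node("n3", frozenset({"bulk", "rack1"})),
]
index_tags = {"products": frozenset({"search"})}
spread_tags = frozenset({"rack1", "rack2"})

primary = ShardCopy("products", 0, primary=True)
replica = ShardCopy("products", 0, primary=False)
placements = [(primary, nodes[0])]  # primary already lives on n1

eligible = [
    n.name
    for n in nodes
    if can_allocate(replica, n, placements, index_tags, spread_tags)
]
print(eligible)
```

In this sketch the replica can only go to `n2`: `n1` already holds the primary on the same rack, and `n3` lacks the `search` tag. An index with no entry in `index_tags` is left unrestricted at the index level.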

