Rack awareness in Elastic Search


(Eran Kutner) #1

Hi,
As mentioned in my previous post, there are a few critical missing
features in Elastic Search I'd like to address.
The second one is what Hadoop and Cassandra call "rack awareness":
the ability to group nodes into distinct groups that are supposed to
be physically redundant, and to have the system consider those groups
when allocating shards.
For example, let's say we have 5 physical servers connected to UPS A
and another 5 connected to UPS B. If one of the UPSs fails it will
take down all the servers connected to it, but the other 5 servers
will continue running as usual. If we have 1 replica for each shard,
the system should in theory continue to operate normally in this
situation, but for that to happen we need to guarantee that the two
copies of a shard are never hosted on servers connected to the same
UPS.
If there were a way to assign each server to a named group, such as
"UPS_A" or "UPS_B", and have the shard allocation process always
prefer to distribute the copies of a shard across as many groups as
possible, that would give the desired behavior.
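To make the proposed rule concrete, here is a minimal sketch in plain Python (not an Elastic Search API; `place_copies` and the node/group names are made up for illustration). It spreads a shard's copies across as many groups as possible before placing a second copy in any group:

```python
from itertools import zip_longest

def place_copies(nodes_by_group, copies):
    """Pick nodes for `copies` copies of a shard, preferring unused groups.

    nodes_by_group: dict mapping a group name (e.g. "UPS_A") to the
    list of node names in that group.
    """
    # Interleave the groups column by column: first one node from each
    # group, then a second node from each group, and so on. Taking the
    # first `copies` entries of that ordering guarantees that no group
    # hosts a second copy until every group already hosts one.
    columns = zip_longest(*(nodes_by_group[g] for g in sorted(nodes_by_group)))
    ordered = [node for col in columns for node in col if node is not None]
    return ordered[:copies]

nodes = {"UPS_A": ["a1", "a2", "a3"], "UPS_B": ["b1", "b2"]}
print(place_copies(nodes, 2))  # one node behind each UPS
```

With 2 groups and 2 copies per shard, the two copies always land behind different UPSs, which is exactly the guarantee needed for the failure scenario above.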

Again, would appreciate any feedback.

-eran


(Shay Banon) #2

Yea, this is not implemented yet but high on the road map.
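For readers finding this thread later: this feature did eventually ship as "shard allocation awareness", configured via custom node attributes. A sketch of what the configuration looks like in recent versions (the attribute name `ups_group` is our own choice, and the exact setting names vary across Elasticsearch versions):

```yaml
# elasticsearch.yml on the servers behind UPS A
# (use "ups_b" on the other five servers)
node.attr.ups_group: ups_a

# Tell the allocator to spread the copies of each shard
# across distinct values of the ups_group attribute:
cluster.routing.allocation.awareness.attributes: ups_group
```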

On Mon, Jul 18, 2011 at 6:17 PM, Eran Kutner eran@gigya.com wrote:



(system) #3