How do rack IDs work in ElasticSearch?

Hello,

Does the 'node.rack' config parameter work like the rack ID in Hadoop?

For example: I have 3 replicas, several indices and 4 rack IDs. Does
ElasticSearch place a 'complete' index in every rack?

regards,
michael

--

Michael Rennecke
Junior Systems Architect, Semantic Web Project, IT

Unister Holding GmbH
Barfußgässchen 11 | 04109 Leipzig

Phone: +49 (0)341 355381 25291
michael.rennecke@unister-gmbh.de
http://www.unister.de/

Authorized managing director: Thomas Wagner
Amtsgericht Leipzig, HRB: 25007

--

Hello Michael,

Yes, the idea is to have complete sets of data deployed on nodes
that share the same attribute value. The attribute can have any name,
like "rack", "rack_id", "datacenter_name", etc.

Just setting the attribute won't do anything - you need to specify
what you want to do with that attribute. There are two typical
scenarios:

  • you want to avoid having primaries and their replicas on a single
    group of nodes. For that, you can configure
    "cluster.routing.allocation.awareness.attributes:
    attribute_name_goes_here". Then, ES will try to allocate a complete
    set of data to nodes having one value of the attribute (say, "rack_id:
    one"), another set of data (replicas) to nodes having another value
    (say, "rack_id: two") and so on.
  • you want to make sure primaries and replicas don't all end up on a
    single group of nodes, even after a failure. For example, you have 5
    nodes in one rack and 5 nodes in another rack, and one index with 5
    shards and 1 replica. With the configuration above you'd end up with
    a complete set of shards per rack. But if a whole rack goes down, ES
    will reallocate the replicas to the available nodes - so you'll have
    all 10 shards (5 primaries + 5 replicas) on your remaining 5 nodes.
    If you add
    "cluster.routing.allocation.awareness.force.zone.values: one,two",
    then when a rack goes down the replicas shouldn't get allocated to
    the remaining rack.
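
As a rough sketch of the first scenario, the per-node configuration could look like the fragment below. The attribute name "rack_id" and the value "one" are only illustrative, and the "node.<name>" form is assumed from the 'node.rack' syntax used in the question; newer ElasticSearch releases spell custom attributes as "node.attr.<name>" instead.

```yaml
# elasticsearch.yml on a node in the first rack
# ("rack_id" and "one" are illustrative; nodes in other racks get other values)
node.rack_id: one

# Same on every node: tell the allocator to spread copies of each shard
# across the different values of rack_id
cluster.routing.allocation.awareness.attributes: rack_id
```

Nodes in the second rack would carry "node.rack_id: two" with the same awareness line.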

Best regards,
Radu

http://sematext.com/ -- ElasticSearch -- Solr -- Lucene

--

On Tue, Nov 13, 2012 at 11:01 AM, Radu Gheorghe
radu.gheorghe@sematext.com wrote:

[...]
    replicas) on your remaining 5 nodes. But if you add
    "cluster.routing.allocation.awareness.force.zone.values: one,two"

That was a copy-paste error, sorry. The attribute name is part of the
setting key, so if your attribute name is "rack_id", the config would be:

cluster.routing.allocation.awareness.force.rack_id.values: one,two

With that in place, if a rack goes down, the replicas shouldn't get
allocated to the remaining rack.
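
Putting the corrected key together with the awareness setting, a forced-awareness configuration might look roughly like this. It is a sketch assuming an attribute named "rack_id" with the two values "one" and "two", and the "node.<name>" attribute form from the question (newer ElasticSearch releases use "node.attr.<name>").

```yaml
# elasticsearch.yml on a node in the second rack (values are illustrative)
node.rack_id: two

# Same on every node: make allocation aware of the rack_id attribute ...
cluster.routing.allocation.awareness.attributes: rack_id

# ... and list every rack that is expected to exist, so that copies meant
# for a missing rack are not squeezed onto the surviving one
cluster.routing.allocation.awareness.force.rack_id.values: one,two
```

With only rack "one" available, the replicas would then stay unassigned until nodes with "rack_id: two" come back.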

Best regards,
Radu

--

Hello,

thanks, Radu.

kind regards,
michael

--