Hello,
I would like to ask about shard allocation:
We have an ES cluster of five machines, four of which are "twins": two machines per "sleeve", sharing a single PSU.
I would like to set up replica allocation so that no replica is allocated to the "twin" of the node holding its primary, so that a PSU failure cannot take out a whole part of the index.
For example, I do not want the replica of shard 2 to be on Node1, because Node1 is the twin of Node2 and shares its PSU.
How can I get it to work this way?
It is for Logstash indexing.
Thank you
AM
(Option 1)
The simplest solution is to have 2 replicas. That way, if any single PSU fails, you will still have one copy of each shard (assuming Node5 is on a different PSU?)
Note that changing this through the update index settings API only changes the number of replicas for existing indices. To persist the setting for new indices, you will also need to change the settings in the index templates.
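As a rough sketch of Option 1, the two request bodies could be built like this. The `logstash-*` pattern and the helper names are assumptions for illustration. The `template` key is the legacy index-template form; newer Elasticsearch versions use `index_patterns` instead, so check the documentation for your version.

```python
import json

def replica_settings_body(replicas):
    """Body for PUT /logstash-*/_settings (affects existing indices only)."""
    return {"index": {"number_of_replicas": replicas}}

def template_body(pattern, replicas):
    """Body for a legacy index template, so that new indices get the setting.

    Assumption: legacy template syntax with a "template" key; newer
    versions expect "index_patterns": [pattern] instead.
    """
    return {"template": pattern, "settings": {"number_of_replicas": replicas}}

if __name__ == "__main__":
    # Print the JSON you would send with curl or any HTTP client.
    print(json.dumps(replica_settings_body(2), indent=2))
    print(json.dumps(template_body("logstash-*", 2), indent=2))
```

Logstash normally installs its own template, so in practice you would likely merge `number_of_replicas` into that template rather than create a separate one.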
(Option 2)
Otherwise, you can disable allocations on the cluster and only issue explicit shard allocations (this is a trade-off between control and convenience).
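A minimal sketch of what Option 2 might look like as request bodies, assuming the `cluster.routing.allocation.enable` setting and the `allocate_replica` command of the cluster reroute API (available in Elasticsearch 5.x and later; older versions used a plain `allocate` command). The index name and node name are hypothetical.

```python
import json

def disable_allocation_body():
    """Body for PUT /_cluster/settings: stop automatic shard allocation."""
    return {"transient": {"cluster.routing.allocation.enable": "none"}}

def allocate_replica_command(index, shard, node):
    """One reroute command placing a replica of `shard` on `node`.

    Assumption: the 5.x+ command name "allocate_replica".
    """
    return {"allocate_replica": {"index": index, "shard": shard, "node": node}}

def reroute_body(commands):
    """Body for POST /_cluster/reroute."""
    return {"commands": list(commands)}

if __name__ == "__main__":
    # Hypothetical example: the primary of shard 2 lives on Node2,
    # so its replica goes to Node3, which is on a different PSU.
    body = reroute_body([allocate_replica_command("logstash-example", 2, "Node3")])
    print(json.dumps(disable_allocation_body(), indent=2))
    print(json.dumps(body, indent=2))
```

With Logstash creating a new index every day, this means issuing such commands for every new index, which is the convenience cost mentioned above.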
You can use shard allocation awareness to prevent the primary and replica of any shard from ending up on nodes under the same PSU. Assign Node1 and Node2 to 'rack1', Node3 and Node4 to 'rack2', and finally Node5 to 'rack3'. Then configure shard allocation awareness to consider the rack id when placing shards. As long as you only have 1 replica configured, it should be possible to distribute the data fairly evenly even though one of the racks has only a single node. The uneven balance between racks can, however, become a problem if a PSU fails and takes down 2 nodes, as recovery will then try to move a copy of all data over to Node5, which could overload that single node.
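The rack assignment described above could be sketched as follows. The attribute name `rack_id` is an arbitrary choice; each node would set it in its `elasticsearch.yml` (as `node.attr.rack_id: rack1` on 5.x and later, or `node.rack_id: rack1` on older versions), and the cluster setting `cluster.routing.allocation.awareness.attributes` tells Elasticsearch to use it. The checker function is only an illustration for validating a placement by hand and is not part of Elasticsearch.

```python
# Rack labels from the answer above: twins share a rack (one PSU each).
NODE_RACKS = {
    "Node1": "rack1", "Node2": "rack1",
    "Node3": "rack2", "Node4": "rack2",
    "Node5": "rack3",
}

# Body for PUT /_cluster/settings enabling awareness on the rack attribute.
AWARENESS_SETTINGS = {
    "persistent": {"cluster.routing.allocation.awareness.attributes": "rack_id"}
}

def shards_sharing_a_rack(placement, node_racks=NODE_RACKS):
    """Return the ids of shards with two or more copies on the same rack.

    `placement` maps a shard id to the list of nodes holding its copies.
    """
    bad = []
    for shard, nodes in placement.items():
        racks = [node_racks[n] for n in nodes]
        if len(set(racks)) < len(racks):
            bad.append(shard)
    return sorted(bad)

if __name__ == "__main__":
    # Shard 0 has both copies behind one PSU; shard 1 is safely spread.
    placement = {0: ["Node1", "Node2"], 1: ["Node2", "Node3"]}
    print(shards_sharing_a_rack(placement))
```

You could feed such a check with the node column from the cat shards API output to confirm that awareness is doing what you expect.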