Multiple DCs - How to prefer local data?

We are running an ES configuration with a single cluster spanning two DCs. The latency between the DCs is acceptable, but not as low as within a DC.

I would like to achieve an active / passive deployment, so that:

  • The primary DC (A) holds all the primary shards and a single replica of each shard.
  • The secondary DC (B) holds two replicas of each shard.
  • If a machine in A fails, ES should prefer promoting the replica in A to primary, not one of the replicas in B.
  • If both the primary and the replica in A fail, ES should promote one of the B replicas to primary, but also create a new replica in A and, once it is ready, make it the primary again.
  • Queries and gets should prefer local data in A (from both primary and replica).

I thought of using forced awareness with 4 groups: A1, A2, B1, B2.
However, in case of a failure in A1, how can I tell ES to specifically promote the replica in A2?
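For reference, a forced-awareness setup along those lines might look like the following in `elasticsearch.yml` (the attribute name `dc_rack` and the group values are my own choices; on ES versions before 5.x the node-level setting would be `node.dc_rack` rather than `node.attr.dc_rack`):

```yaml
# On a node in DC A, group A1 (set the matching value on each node)
node.attr.dc_rack: A1

# Spread shard copies across these attribute values; with forced
# awareness, ES leaves replicas unassigned rather than placing two
# copies of the same shard inside one group.
cluster.routing.allocation.awareness.attributes: dc_rack
cluster.routing.allocation.awareness.force.dc_rack.values: A1,A2,B1,B2
```

Note that awareness only controls *allocation*; as far as I can tell it does not let you control *promotion order*, which is exactly the gap in this approach.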

Update: I thought of another possible solution: use shard allocation awareness (not forced) with two groups, A and B, and 3 replicas. My hope is that if the primary shard in A fails, ES will prefer promoting the replica in A rather than one of the replicas in B (can someone confirm that?). This solution also requires application-level monitoring of cluster health: when the application (in the active DC) discovers that the other DC is down, it should temporarily decrease the replica count from 3 to 1, to avoid ES creating two additional local replicas in A.
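The replica-count change that the application would make is a dynamic index setting, so it can be applied at runtime through the index settings API, something like this (the index name `my_index` is a placeholder):

```
PUT /my_index/_settings
{
  "index": { "number_of_replicas": 1 }
}
```

Setting it back to 3 once the other DC returns would let ES rebuild the remote copies.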


Pro tip: don't do this! ES does not tolerate latency well, and you will end up regretting it when the link goes down and your replication breaks, or when a latency spike makes your nodes "drop" out of the cluster. There are a number of other threads on this topic, so I'd encourage you to look at those.

You can use the `preference` parameter on search and get requests for this sort of thing.
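For example, a search pinned toward specific nodes might look like this (the exact preference values depend on your ES version: `_prefer_nodes` takes a list in 5.x/6.x, while older versions have the singular `_prefer_node`; `my_index` and the node names here are placeholders):

```
GET /my_index/_search?preference=_prefer_nodes:nodeA1,nodeA2
{
  "query": { "match_all": {} }
}
```

There is also `_local`, which prefers a shard copy on the node that receives the request, and an arbitrary custom string, which consistently routes a given user to the same shard copies.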