Multi-datacenter deployments

Hi everyone,

We're working on making one of our applications that uses ES run in multiple
data centers (active/passive for now).

The other data stores we have can replicate seamlessly and have some idea of
"local" vs "remote", so we'd rather not run the ES instances as completely
separate clusters.

A minimal-ish TODO list I could think of is:

  • Being able to discover nodes across networks (I think that already works
    with a bit of configuration, no?)
  • Have ES keep at least one copy of each shard in each datacenter.
  • When copying shards on startup, pull them from a local node if possible.
  • When doing queries, prefer shards that are local.
  • Have the client prefer local servers.
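
The last two items on the list amount to a locality preference. A minimal sketch in Python, assuming each shard copy is tagged with the datacenter of the node that holds it (`ShardCopy`, `pick_copy` and the `dc1`/`dc2` tags are hypothetical names for illustration, not Elasticsearch APIs):

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ShardCopy:
    shard_id: int
    node: str
    datacenter: str  # hypothetical tag, e.g. "dc1" or "dc2"


def pick_copy(copies: List[ShardCopy], local_dc: str) -> ShardCopy:
    """Return a copy in the local datacenter if one exists, else any copy."""
    if not copies:
        raise ValueError("no copies available for this shard")
    local = [c for c in copies if c.datacenter == local_dc]
    # Fall back to a remote copy only when no local one is available.
    return (local or copies)[0]
```

The same rule would apply to a client choosing which server to talk to: filter by the local tag first, and use a remote server only when nothing local is reachable.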

Is any of this on the roadmap? I think we'd be interested in helping
sponsor this work if possible.

 - ask

Hi,

Smarter, more controllable shard allocation is on the roadmap. It would let
you control shard allocation in a manner similar to what you described, but
it will only work properly for DCs that have a fast connection between them.
Otherwise a different approach is needed: write-ahead-log-based replication
between two separate ES clusters.
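
For datacenters with a fast link, later Elasticsearch releases document "shard allocation awareness" settings along these lines. The sketch below only builds the JSON body for a cluster settings update and sends nothing. The attribute name `zone` and the values `dc1`/`dc2` are assumptions, and the setting names should be checked against the documentation for the version in use:

```python
import json
from typing import Dict, List


def awareness_settings(attribute: str, values: List[str]) -> Dict[str, Dict[str, str]]:
    """Build a cluster settings body that spreads shard copies across `values`."""
    prefix = "cluster.routing.allocation.awareness"
    return {
        "persistent": {
            # Attribute that each node is assumed to be tagged with.
            f"{prefix}.attributes": attribute,
            # Forced awareness: leave copies unassigned rather than
            # placing all of them in a single zone.
            f"{prefix}.force.{attribute}.values": ",".join(values),
        }
    }


body = awareness_settings("zone", ["dc1", "dc2"])
print(json.dumps(body, indent=2))
```

This covers the "one copy per datacenter" item only. It does not address the slow-link case, where separate clusters with log-based replication are suggested instead.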

(In reply to Ask Bjørn Hansen ask@develooper.com, Wed, Aug 3, 2011 at 1:33 AM.)

I am looking to do something very similar here with two datacenters that
are about 20ms apart. All of the indexing happens in one DC, so I would
prefer all the primary shards to be in that DC and all the replication to
happen across the VPN to the second DC. Otherwise I could be sending data
across the interconnect twice.
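
The "twice" can be made concrete with a simplified model in which the client sends each document to the primary and the primary forwards it to every replica. This is a sketch under that assumption only; the function name and the `dc1`/`dc2` labels are illustrative:

```python
from typing import List


def cross_dc_transfers(client_dc: str, primary_dc: str, replica_dcs: List[str]) -> int:
    """Count how many times one indexed document crosses a datacenter boundary."""
    transfers = 0
    if client_dc != primary_dc:
        transfers += 1  # client -> primary
    # primary -> each replica held in a different datacenter
    transfers += sum(1 for dc in replica_dcs if dc != primary_dc)
    return transfers


# Indexing client in dc1, primary in dc1, replica in dc2.
print(cross_dc_transfers("dc1", "dc1", ["dc2"]))  # 1
# Indexing client in dc1, primary in dc2, replica back in dc1.
print(cross_dc_transfers("dc1", "dc2", ["dc1"]))  # 2
```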

It sounds like the WAL replication functionality is not implemented yet,
so this sort of master-slave replication between clusters is not (yet)
available. Is that correct?

I think you can do what you want with the cluster allocation API.

But it doesn't look very robust, e.g. you can specify that certain
nodes shouldn't have primary shards, but if a primary is missing, a
replica (in your other DC) might still be promoted to primary.

Maybe someone here has more experience with this.

(In reply to Loren loren@siebert.org, Mon, Jul 9, 2012 at 11:42 PM.)

I did notice the cluster API and some of the other posts on this multi-datacenter topic.

Since I just want the primary copies to be in the DC1 zone before the main indexing begins, I was thinking I could disable the nodes in the DC2 zone long enough for any replicas on DC1 nodes to be promoted to primary, and then re-enable DC2 so that its copies come back as replicas. It seems like there might be a way to do this programmatically, but I am still learning about ES and am not sure what’s there, what’s not, and what’s in the development pipeline.
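
The disable-then-re-enable idea can be walked through with a toy model. This is not Elasticsearch code: it assumes one primary and one replica per shard, that a surviving replica is promoted when its primary disappears, and that copies added later join as replicas. All names are illustrative:

```python
from typing import Dict, List

# shard id -> list of copies, each {"dc": ..., "role": ...}
Cluster = Dict[int, List[Dict[str, str]]]


def disable_dc(cluster: Cluster, dc: str) -> None:
    """Drop all copies in `dc`; promote a surviving replica if the primary is lost."""
    for copies in cluster.values():
        copies[:] = [c for c in copies if c["dc"] != dc]
        if copies and not any(c["role"] == "primary" for c in copies):
            copies[0]["role"] = "primary"


def enable_dc(cluster: Cluster, dc: str) -> None:
    """Re-add one copy per shard in `dc`; it joins as a replica."""
    for copies in cluster.values():
        copies.append({"dc": dc, "role": "replica"})


cluster: Cluster = {
    0: [{"dc": "dc2", "role": "primary"}, {"dc": "dc1", "role": "replica"}],
    1: [{"dc": "dc1", "role": "primary"}, {"dc": "dc2", "role": "replica"}],
}
disable_dc(cluster, "dc2")
enable_dc(cluster, "dc2")
primary_dcs = {
    shard: next(c["dc"] for c in copies if c["role"] == "primary")
    for shard, copies in cluster.items()
}
print(primary_dcs)  # {0: 'dc1', 1: 'dc1'}
```

In this model nothing pins the primaries to DC1 afterwards, so a later DC1 node failure could still promote a DC2 replica.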

(In reply to Ævar Arnfjörð Bjarmason, Jul 9, 2012, at 3:24 PM.)