Remote asynchronous read replica - is it possible?

tomekit · March 1, 2017, 11:49am

Hi,

I've got a multi-DC setup (East Asia, EU, US).
EU is a central location where there is a cluster of 3 master nodes. All writes are performed to the EU location.

There is just one ES data node in each remote DC (US, East Asia).

In order for these remote nodes to be part of a cluster when there are network problems, I've got this in the config:
discovery.zen.fd.ping_interval: 15s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 5

... but this is not a solution. If there is a network interruption between the master DC (EU) and remote DC the cluster just stops, it's not possible to index a document. The index latency skyrockets, client application gets on hold.

I believe that this is because the writes are synchronous. When one of the cluster nodes is unresponsive, the cluster waits for that node to respond until it gets "kicked out". But the unresponsive node won't get kicked out because of the zen.fd settings which requires 5 ping retries, each of them taking 60s, giving total of 5 minutes. That's way too long.

Is it possible to set up ES natively, so it does replicate things to remote node / cluster?
The write-availability in EU is really important here.
The read-availability in remote DC is important as well, but if data is few minutes old no one would care anyway.

If it's not possible to achieve it with the ES native setup what "out of the box" options would you recommend?
I've seen some Kafka solutions, but actually I would prefer RabbitMQ, as I don't want add more complexity to manage to my stack.

Thanks !

dadoonet · March 1, 2017, 12:15pm

Cross DC replication is not supported yet.

Using something like RabbitMQ could work in the meantime.

tomekit · March 1, 2017, 12:46pm

I like RabbitMQ solution, but instead of crafting the solution and later managing it, I would rather use something off the shelf.

Following on my question.
Assuming I have 5 nodes. The quroum is 3. There are 4 replicas, which means that each node has full dataset. Min master nodes is set to 2.

Would it be possible for me to set the:

write availability to quorum,
read availability to 1 or zero (on the remote nodes).

Would it allow me to "always write" assuming there are 3 nodes running, out of which 2 needs to be master?
Would it allow me to "always read" from the remote nodes even if they're not connected to other nodes in the cluster / including masters?

I still hope that with a smart play of all these parameters it may be possible to achieve what I am looking for.

Do you know when there are plans to support cross DC replication?

dadoonet · March 1, 2017, 1:09pm

I believe that lot of answers can be found in https://www.elastic.co/guide/en/elasticsearch/reference/5.2/docs-replication.html

Look also at https://www.elastic.co/guide/en/elasticsearch/reference/5.2/docs-index_.html#index-wait-for-active-shards about write operations.

Read can always happen even if you are missing some shards. You will just get less results than with all shards.

As long as you have a primary shard you can write to it. Unless you set wait_for_active_shards.

To perform whatever operation the cluster needs to have on master node. If you don't have a master node, reads and writes will be rejected.

To achieve what you are looking for, don't split your cluster across multiple geo locations.
It's not supported.

I don't know any date regarding XDCR but I know that it will take a loooooong time to get there. So don't count on it now. BTW it will probably be part of X-Pack (license based) as per this comment: https://github.com/elastic/elasticsearch/issues/21209#issuecomment-258619371

system · March 29, 2017, 1:09pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.