I'd like to ask for advice about deployment in a multi-DC scenario.
Currently we operate on 2 data centers in active/standby mode. In the case
of ES we'd like to take a different approach - we'd like to operate in
active-active mode (we want to optimize our resources, especially for
querying).
Here are some details about the target configuration:
- 4 ES instances per DC. The full cluster will have 8 instances.
- Up to 1 TB of data
- Data pulled from the database using JDBC River
- The database is replicated asynchronously between DCs. Each DC will have
  its own database instance to pull data from.
- Average latency between DCs is several milliseconds
- We need to keep operating when the passive DC is down
We know that a multi-DC configuration might end with a split-brain issue.
Here is how we want to prevent it:
- Set node.master: true only on the 4 nodes in the active DC
- Set node.master: false in the passive DC
- This way we'll be sure that a new cluster will not be created in the
  passive DC
- Additionally we'd like to set discovery.zen.minimum_master_nodes: 3
  (to avoid split brain in the active DC)
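
To make the reasoning behind these settings concrete, here is a minimal sketch of the commonly cited quorum rule for discovery.zen.minimum_master_nodes, (master-eligible nodes / 2) + 1. The function names and the scenarios are illustrative assumptions, not part of ES; the sketch only models the counting, not the actual election protocol.

```python
def quorum(master_eligible: int) -> int:
    """Quorum rule: (master-eligible nodes / 2) + 1, using integer division."""
    return master_eligible // 2 + 1


def can_elect_master(visible_master_eligible: int, minimum_master_nodes: int) -> bool:
    """A group of nodes can elect a master only if it sees at least
    minimum_master_nodes master-eligible nodes."""
    return visible_master_eligible >= minimum_master_nodes


# Values taken from the plan above: 4 master-eligible nodes in the active DC,
# none in the passive DC, and minimum_master_nodes set to 3.
ACTIVE_DC_MASTERS = 4
PASSIVE_DC_MASTERS = 0
MINIMUM_MASTER_NODES = quorum(ACTIVE_DC_MASTERS)

# Hypothetical failure scenarios: how many master-eligible nodes one side sees.
scenarios = {
    "link down, active DC side": ACTIVE_DC_MASTERS,
    "link down, passive DC side": PASSIVE_DC_MASTERS,
    "active DC split 2/2 internally": 2,
}

for name, visible in scenarios.items():
    print(name, "->", can_elect_master(visible, MINIMUM_MASTER_NODES))
```

Under these assumptions the passive DC can never form its own cluster, but it also cannot take over on its own if the active DC is lost, which is what makes the switchover below necessary.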
Additionally there is a problem with switchover (the passive DC becomes
active and the active one becomes passive). In our system it takes about 20
minutes and this is the maximum length of our maintenance window. We were
thinking of shutting down the whole ES cluster and switching the node.master
setting in the configuration files (as far as I know this setting cannot be
changed via the REST API). Then we'd need to start the whole cluster again.
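
As a rough illustration of the configuration change during switchover, the sketch below flips node.master in the text of a node's configuration file. The helper name set_node_master and the sample file content are hypothetical; it assumes the setting appears as a flat, uncommented "node.master: ..." line and does not stop or start any nodes.

```python
import re

# Matches an uncommented, top-level "node.master: <value>" line.
_NODE_MASTER = re.compile(r"^node\.master\s*:.*$", re.MULTILINE)


def set_node_master(config_text: str, eligible: bool) -> str:
    """Return config_text with node.master set to the requested value.

    Replaces the first existing node.master line, or appends one if the
    setting is not present.
    """
    line = "node.master: " + ("true" if eligible else "false")
    if _NODE_MASTER.search(config_text):
        return _NODE_MASTER.sub(line, config_text, count=1)
    separator = "" if config_text.endswith("\n") or not config_text else "\n"
    return config_text + separator + line + "\n"


# Hypothetical configuration of a node in the DC that is being demoted.
old_active = "cluster.name: search\nnode.master: true\nnode.data: true\n"
print(set_node_master(old_active, eligible=False))
```

The rewritten text would still have to be written back to each node's configuration file while the cluster is stopped, so the whole procedure has to fit inside the maintenance window.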
So my question is: is it better to have one big ES cluster operating on
both DCs or should we change our approach and create 2 separate clusters
(and rely on database replication)? I'd be grateful for advice.
Go with the latter method and have two clusters. ES can be very sensitive to
network latency and you'll likely end up with more problems than it is
worth.
Given you already have the source-of-truth data being replicated, the
sanest option is to just read that locally.
Thanks for the answer! We've been talking with several other teams in our
company and it looks like this is the most recommended and stable setup.
Regards
Sebastian
On Wednesday, 7 May 2014 03:23:43 UTC+2, Mark Walkom wrote:
> Go with the latter method and have two clusters. ES can be very sensitive
> to network latency and you'll likely end up with more problems than it is
> worth.
> Given you already have the source-of-truth data being replicated, the
> sanest option is to just read that locally.
Having separate clusters is definitely the better way to go. Alternatively,
you can control shard and replica placement so that they are always placed
in the same DC. This way you can avoid inter-DC issues while still having a
single cluster. I have a similar issue and I am looking at this as one of
the alternatives.
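
One possible reading of this placement suggestion is index-level shard allocation filtering. The sketch below only builds the settings body that would be sent to an index's settings endpoint; it does not contact a cluster. The attribute name dc and the value dc1 are hypothetical: nodes would need to be started with a matching custom attribute, and the exact setting name should be checked against the documentation for the ES version in use.

```python
import json


def allocation_filter_settings(attribute: str, value: str) -> dict:
    """Build an index settings body that requires all shards of the index
    to be allocated on nodes whose custom attribute equals `value`."""
    key = "index.routing.allocation.require." + attribute
    return {key: value}


# Hypothetical attribute name and value identifying one data center.
body = allocation_filter_settings("dc", "dc1")
print(json.dumps(body))
```

Pinning an index this way keeps its primaries and replicas in one DC, so it trades cross-DC redundancy for the absence of inter-DC replication traffic.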
On Saturday, May 10, 2014 1:05:08 AM UTC-7, Sebastian Łaskawiec wrote:
> Thanks for the answer! We've been talking with several other teams in our
> company and it looks like this is the most recommended and stable setup.
> Regards
> Sebastian
We are still thinking about the production configuration, and here is a
short list of the advantages and disadvantages of a single cluster versus
separate clusters.
Single cluster:
(+) With a single cluster you run a single query against the database.
With a cluster per DC, each cluster needs to query the DB separately
(+) Data consistency - as a matter of fact this is achieved by the
single query to the DB
(+) You can introduce a new DC easily
(+) True active-active configuration
(-) Split brain and a pretty complicated configuration (to avoid split
brain when the DC link is down)
(-) The node.master setting cannot be changed at runtime (take a look at
my first post and the split-brain solution)
(-) In case of a disaster we need to operate on a single DC. If you use
a single cluster across 2 DCs you can't really tell whether a single DC is
strong enough to handle the query and indexing load
(-) In a pessimistic scenario data travels through the WAN 2 times (first
time - database replication, second time - ES replication)
(-) You can't really tell which node will respond to a query. Let's
assume that you have the full index in each DC (forced awareness option).
ES might decide to gather results from the remote DC and not from the local
one. This way you need to add WAN latency to your query time.
(-) You need to turn off the whole cluster or perform rolling restarts
during an upgrade
Separate cluster per DC:
(+) No split brain
(+) You can tell precisely when you are out of resources to handle the
load in the ES cluster in each DC
(+) You can experiment with different settings in production. If
something goes wrong, just switch clients to the standby DC.
(+) Full failover - in case of any problems, just switch to the other
DC
(+) Upgrades are easy and you have no downtime (upgrade the first DC,
stabilize it, test it, and then do the same in the other DC)
(+) Since these are 2 separate clusters you can avoid data traveling
through the WAN during queries. Each DC queries its nodes locally.
(-) It is not a full active-active configuration. It's more like an
active-standby configuration
(-) Data inconsistency might occur (different results when querying the
local and the remote DC)
(-) Each DC will query the DB separately. This will generate additional
load on the DB
Right now we think we should go for 2 separate clusters. DB load is the
thing that worries me the most (we have a really complicated query with a
lot of left joins). However, we think that in our case having two separate
clusters has more advantages than disadvantages.
If you have any more arguments or comments, please let us know.
Regards
Sebastian
On Monday, 12 May 2014 20:02:35 UTC+2, Deepak Jha wrote:
> Having separate clusters is definitely the better way to go.
> Alternatively, you can control shard and replica placement so that they
> are always placed in the same DC. This way you can avoid inter-DC issues
> while still having a single cluster. I have a similar issue and I am
> looking at this as one of the alternatives.