Hi Shay.
Just a question.
Is ES datacenter aware? Does it support some kind of replication of the index or the data if I want to sync it between 2 datacenters?
10x
Hi,
There are several ways to try to solve the "data center problem". In short, elasticsearch is not data center aware. If you want to sync between two data centers, you need to do it manually. How do you solve the two data center problem with your data storage? Maybe based on that I can help.
-shay.banon
So... it depends on the data storage:
- MySQL has its own replication mechanism.
- Solr - same thing - has its own replication.
- MogileFS is DC aware and can send X replicas of the data to the second DC.
- Cassandra and Hadoop are DC aware.
I think something along the lines of Cassandra's awareness would be great. Do you have any plans for this feature?
Do you handle cases where the two data centers have conflicting updates? Cassandra "can" handle it; the others I am not so sure about. What exactly are you after with two data centers? One active and one backup, with reads going local to each?
Yes, that would be a great start: update the index in one "master" datacenter and then replicate it to the second one; reads are done on both DCs. I also want to keep the option that, if the "master" DC fails, I can move the writes to the second one.
10x
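A minimal client-side sketch of the failover Ori describes, not an elasticsearch feature: writes go to the "master" DC first and fall back to the second DC only if the master is unreachable. The cluster URLs, index name, `_doc` path, and the `requests` dependency are illustrative assumptions.

```python
import requests

# Hypothetical endpoints for the two data centers (assumptions, not from the thread).
PRIMARY_DC = "http://es-dc1.example.com:9200"
SECONDARY_DC = "http://es-dc2.example.com:9200"

def index_with_failover(index, doc_id, doc, timeout=5):
    """Index into the "master" DC; fall back to the second DC if it is down.

    Uses the document index REST endpoint; the exact path ({index}/{type}/{id}
    vs {index}/_doc/{id}) depends on the elasticsearch version.
    """
    for base in (PRIMARY_DC, SECONDARY_DC):
        try:
            resp = requests.put(f"{base}/{index}/_doc/{doc_id}", json=doc, timeout=timeout)
            resp.raise_for_status()
            return base  # report which DC accepted the write
        except requests.RequestException:
            continue  # this DC is unreachable or rejected the write: try the other one
    raise RuntimeError("both data centers rejected the write")

# Example: index_with_failover("products", "42", {"name": "widget"})
```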
You did not answer my question :). How do you handle it today? Do you handle conflicting updates?
Regarding elasticsearch: yes, I do plan to support various models:
- Two completely separate clusters that replicate changes to the other cluster. Reads/search go to the local datacenter by "default", since you configure the search/read clients in each data center (your web tier, or something similar, works against the local cluster). Note that you can do it today quite easily on the "client" side: make sure the code you use to index data applies it to both data centers (queue it, or something similar; see the sketch below).
- A single cluster that spans two data centers, with a special allocation strategy that makes sure a shard and its replica do not exist in the same data center, and that reads/searches prefer "local" data center shards before going to search in the other data center.
Both are not that difficult to implement thanks to how elasticsearch is designed.
-shay.banon
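A rough sketch of the client-side dual-write Shay suggests, under illustrative assumptions: two hypothetical cluster URLs, one plain REST call per data center, and failed remote writes pushed onto an in-process queue standing in for whatever durable queue a real deployment would use.

```python
import queue
import requests

# Hypothetical cluster endpoints (assumptions for illustration only).
DATA_CENTERS = ["http://es-dc1.example.com:9200", "http://es-dc2.example.com:9200"]

# Failed writes are queued so a background worker can retry them later;
# in a real deployment this would be a durable queue, not an in-process one.
retry_queue = queue.Queue()

def index_everywhere(index, doc_id, doc):
    """Apply the same indexing call to both data centers (the client-side option)."""
    for base in DATA_CENTERS:
        try:
            resp = requests.put(f"{base}/{index}/_doc/{doc_id}", json=doc, timeout=5)
            resp.raise_for_status()
        except requests.RequestException:
            # Could not reach this DC right now: queue the write for later replay.
            retry_queue.put((base, index, doc_id, doc))

# Example: index_everywhere("products", "42", {"name": "widget", "price": 9.99})
```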
So... basically we have no need to handle conflicts, as writes are done in only one DC and replicated to the other.
I will be happy to hear about it when we meet.
"A single cluster that spans two data centers, with special allocation
strategy" +1 This would be great.
Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype
Ahh, life is simple :). This is a much simpler case to solve.
-shay.banon
Bringing this old topic back up a bit...
We have done multi-datacenter deployments of Solr and replicated across them: initially by taking snapshots of changes made before optimize, later with file system replication, and finally by having a transaction log that feeds Solr and is replayed in the 2nd data center, where indexing is performed on its own.
The log files therefore serve double duty, allowing quick re-indexing without going back to the source (if they are retained). That is something worth thinking about for ES. Have a schema change? Reindex by replaying the logs at a much higher rate than the ingestion system could manage starting from the raw source.
This also helps in the case where ES is the only store (other than raw source material such as files) and you want to trust that you have a quick rebuild path. And it helps if you have another source, such as a DB, where you may not have a quick way to bulk export for a full reindex.
(Note: this log is obviously pre-analyzer and consists basically of the input documents.)
I can quickly add this to my fork of ES and see how it plays, assuming that the client side writes logs from multiple writers and a River consumes them on ES by merge-sorting transactions from the many logs in bulk.
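A rough sketch of the pre-analyzer transaction log idea, under stated assumptions: the log is a hypothetical JSON-lines file of input documents written by the indexing client, and replay goes through elasticsearch's bulk REST endpoint (the exact action metadata fields vary by version). This illustrates the approach, not the poster's actual implementation.

```python
import json
import requests

ES_URL = "http://localhost:9200"   # assumed replay target
LOG_PATH = "transactions.log"      # hypothetical JSON-lines transaction log

def log_index_op(index, doc_id, doc, log_path=LOG_PATH):
    """Append one pre-analyzer indexing operation (the raw input document) to the log."""
    with open(log_path, "a") as log:
        log.write(json.dumps({"index": index, "id": doc_id, "doc": doc}) + "\n")

def replay_log(log_path=LOG_PATH, batch_size=500):
    """Replay the log against a cluster via the _bulk endpoint, e.g. to reindex
    after a schema change or to feed a second data center."""
    batch = []
    with open(log_path) as log:
        for line in log:
            op = json.loads(line)
            # One action line plus one source line per document, newline-delimited.
            batch.append(json.dumps({"index": {"_index": op["index"], "_id": op["id"]}}))
            batch.append(json.dumps(op["doc"]))
            if len(batch) >= 2 * batch_size:
                _send_bulk(batch)
                batch = []
    if batch:
        _send_bulk(batch)

def _send_bulk(lines):
    body = "\n".join(lines) + "\n"  # the bulk body must end with a newline
    resp = requests.post(f"{ES_URL}/_bulk", data=body,
                         headers={"Content-Type": "application/x-ndjson"})
    resp.raise_for_status()
```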
Shay - this is a pretty old thread; wondering if there are features in ES now for multi-data-center deployment.