Geo-distributed cluster


(jeangld) #1

Hi,

I was able to set up my local mini cluster, and ES finds the
nearby machines on my local network and redistributes the data
according to my settings. I've also read about the Amazon EC2
instances, but I'm thinking about the following situation:

I have two dedicated servers in Europe, two in the USA and two in Asia.
I plan to distribute my ES index across those locations. If I set my
replicas to 2, I intend to end up with a copy in each location. So the
two servers in each location share the load, and a full copy is
available on every continent. If one server (say, in the USA) goes down,
ES would find the necessary data (with a higher network latency)
either in Europe or in Asia.

Is it possible to configure ES in such a way that I list the IP
addresses of all 6 servers in a config file and it knows how to
distribute the data accordingly? From what I found when I searched,
this was planned for some future version, but I don't know how current
that information is.

Thanks,

Jean


(Otis Gospodnetić) #2

Jean,

Look up Allocation Awareness on the ES site, plus the thread I started
on that topic on the mailing list recently.

Otis

Search Analytics - http://sematext.com/search-analytics/index.html
Scalable Performance Monitoring - http://sematext.com/spm/index.html
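
For readers following along, here is a rough sketch of what an allocation awareness setup for the six servers might look like, written as a small Python helper that builds the per-node settings. The host names and the attribute name "zone" are made up for illustration. The setting keys follow the Elasticsearch documentation from around the time of this thread; newer releases rename some of them (for example node.attr.zone and discovery.seed_hosts), so check the docs for your version before copying anything.

```python
# Hypothetical host names; replace with the real addresses of the six servers.
HOSTS_BY_ZONE = {
    "europe": ["eu-1.example.com", "eu-2.example.com"],
    "usa": ["us-1.example.com", "us-2.example.com"],
    "asia": ["asia-1.example.com", "asia-2.example.com"],
}


def node_settings(zone, hosts_by_zone):
    """Return elasticsearch.yml-style settings for one node in the given zone."""
    all_hosts = [h for hosts in hosts_by_zone.values() for h in hosts]
    return {
        # Custom node attribute; the name "zone" is an arbitrary choice.
        "node.zone": zone,
        # Ask the allocator to spread the copies of each shard across zones.
        "cluster.routing.allocation.awareness.attributes": "zone",
        # Multicast discovery generally does not work across data centers,
        # so every node is listed explicitly (assumed era-specific key names).
        "discovery.zen.ping.multicast.enabled": False,
        "discovery.zen.ping.unicast.hosts": all_hosts,
    }


def copies_per_shard(number_of_replicas):
    """One primary plus its replicas."""
    return 1 + number_of_replicas
```

With index.number_of_replicas set to 2, copies_per_shard(2) gives three copies of every shard, which matches the three zones, so awareness can place one copy per continent. Which of the two servers in a zone holds a given copy is still up to ES.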


(Berkay Mollamustafaoglu-2) #3

I think having a cluster distributed across data centers is not a good
idea. One of the most problematic things that can happen with ES is when
the nodes lose communication with each other, which is referred to as
"split brain". Connectivity issues are a lot more likely when a
cluster spans data centers. Latency may also cause performance
problems.
You'd be better off having a separate cluster at each data center and
using the application layer to index the data into each cluster, rather
than relying on ES to replicate the data.

My 3.1415 cents ...

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype
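
To make the application-layer idea concrete, here is a minimal sketch, assuming each cluster is reachable through some function that indexes one document. The names index_everywhere and make_store are hypothetical, and the in-memory stores only stand in for real per-cluster clients. The helper writes the same document to every cluster and reports which writes failed, so the caller can retry or queue them.

```python
def index_everywhere(doc_id, doc, clusters):
    """Send one document to every cluster.

    `clusters` maps a cluster name to a callable(doc_id, doc) that indexes
    the document there. Returns the names of clusters whose write failed.
    """
    failed = []
    for name, index_fn in clusters.items():
        try:
            index_fn(doc_id, doc)
        except Exception:
            # A remote data center being unreachable must not block the others.
            failed.append(name)
    return failed


def make_store():
    """In-memory stand-in for a cluster client (illustration only)."""
    store = {}

    def index_fn(doc_id, doc):
        store[doc_id] = doc

    return store, index_fn


def unreachable(doc_id, doc):
    """Stand-in for a cluster that cannot be reached."""
    raise ConnectionError("data center unreachable")


europe_store, europe_index = make_store()
usa_store, usa_index = make_store()
clusters = {"europe": europe_index, "usa": usa_index, "asia": unreachable}

failed = index_everywhere("1", {"title": "hello"}, clusters)
```

Here failed lists only "asia", while the Europe and USA stores both hold the document. A real setup would also need a retry queue or a periodic re-sync for the failed writes, which this sketch leaves out.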

