Intercommunication between machines in Elasticsearch

I'm trying to wrap my head around the issues of using the Java API and how
the internals of Elasticsearch (ES) work. In fact, the organization of a
large ES installation would be nice to have cleared up.

Let's say I wanted to create a large multinational, multilingual site.
There'd be data centers in various cities of the world, each set with the
country/language locale, each with local content. My goal is to have those
searchable from around the world, based on some small amount of translation
upon data insertion.

How would the clusters, nodes, etc be set up to talk to each other? And
what is exchanged, pure JSON? What?

You mean each data center will have its own set of data, and own set of
machines? You can have an ES cluster (or a single node) per data center in
this case, each responsible for its own content.
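
A minimal sketch of the one-cluster-per-data-center layout suggested above. The data center names, locales, and host names are hypothetical and do not come from this thread, and the code only models which cluster is responsible for a document; it does not talk to Elasticsearch.

```python
# Hypothetical layout: one independent ES cluster per data center.
# Every name, locale, and host below is an illustrative assumption.
CLUSTERS = {
    "us-east": {"locale": "en_US", "hosts": ["es1.us-east.example.com:9200"]},
    "eu-paris": {"locale": "fr_FR", "hosts": ["es1.eu-paris.example.com:9200"]},
    "ap-tokyo": {"locale": "ja_JP", "hosts": ["es1.ap-tokyo.example.com:9200"]},
}


def owning_cluster(doc_locale: str) -> str:
    """Return the data center whose cluster is responsible for this locale."""
    for dc, info in CLUSTERS.items():
        if info["locale"] == doc_locale:
            return dc
    raise KeyError(f"no data center configured for locale {doc_locale!r}")


if __name__ == "__main__":
    # Local content is routed to the cluster in its own data center.
    print(owning_cluster("fr_FR"))
```

Keeping the mapping in one place means the application, not Elasticsearch, decides where each document lives, which matches the idea of each cluster being responsible for its own content.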

On Tue, May 1, 2012 at 4:23 PM, Dennis gearond@gmail.com wrote:

I'm trying to wrap my head around the issues of using the Java API and how
the internals of Elasticsearch (ES) work. In fact, the organization of a
large ES installation would be nice to have cleared up.

Let's say I wanted to create a large multinational, multilingual site.
There'd be data centers in various cities of the world, each set with the
country/language locale, each with local content. My goal is to have those
searchable from around the world, based on some small amount of translation
upon data insertion.

How would the clusters, nodes, etc be set up to talk to each other? And
what is exchanged, pure JSON? What?

How easy would it be to make local data be master at each site, and slave
clusters of data from other locations, so searching would be done locally
on real-time-local-data and on lazily updated remote data (from other
locales)?

On Wednesday, May 2, 2012 11:27:50 AM UTC-5, kimchy wrote:

You mean each data center will have its own set of data, and own set of
machines? You can have an ES cluster (or a single node) per data center in
this case, each responsible for its own content.

I don't really follow what you are asking.... Maybe you could describe what
you ultimately want to (be able to) do?

Otis

On Wednesday, May 2, 2012 9:34:46 PM UTC-4, Dennis wrote:

How easy would it be to make local data be master at each site, and slave
clusters of data from other locations, so searching would be done locally
on real-time-local-data and on lazily updated remote data (from other
locales)?

In this case, you will need to make sure that each DC also writes to all the
other clusters, so they will have the data locally as well.
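
One way to picture this, as a sketch only: the application writes to its local cluster right away and queues a copy for every other cluster, to be replayed later. The `Cluster` and `FanOutWriter` classes are hypothetical in-memory stand-ins invented for illustration; they are not part of any Elasticsearch client, and a real setup would also need retries and conflict handling.

```python
from collections import deque


class Cluster:
    """In-memory stand-in for one data center's cluster (not a real client)."""

    def __init__(self, name):
        self.name = name
        self.docs = {}

    def index(self, doc_id, doc):
        self.docs[doc_id] = doc


class FanOutWriter:
    """Write to the local cluster immediately; queue copies for the others."""

    def __init__(self, local, remotes):
        self.local = local
        self.remotes = list(remotes)
        self.pending = deque()  # (remote cluster, doc_id, doc) awaiting replay

    def write(self, doc_id, doc):
        self.local.index(doc_id, doc)  # local data is searchable right away
        for remote in self.remotes:
            self.pending.append((remote, doc_id, doc))

    def flush(self):
        """Replay queued writes to the remote clusters; return how many were sent."""
        sent = 0
        while self.pending:
            remote, doc_id, doc = self.pending.popleft()
            remote.index(doc_id, doc)
            sent += 1
        return sent


if __name__ == "__main__":
    paris, tokyo = Cluster("eu-paris"), Cluster("ap-tokyo")
    writer = FanOutWriter(local=paris, remotes=[tokyo])
    writer.write("1", {"title": "Bonjour"})
    print("1" in paris.docs, "1" in tokyo.docs)  # local copy exists, remote not yet
    writer.flush()
    print("1" in tokyo.docs)  # remote copy exists after the lazy replay
```

Calling `flush()` on a timer or from a background worker would give the "real-time local, lazily updated remote" behaviour asked about earlier in the thread.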

I was thinking that would be what I would have to do, Kimchy/Shay. I'll
have to revisit this later, after solving more near-term design problems.

On Friday, May 4, 2012 4:47:04 AM UTC-5, kimchy wrote:

In this case, you will need to make sure that each DC also writes to all the
other clusters, so they will have the data locally as well.
