Linking several ES *clusters* to allow multi-site search

Hello all,

I'm wondering whether the following scenario is possible with native ES, or
with some extra work.

Site S1 has an ES cluster with its own data; site S2 has its own cluster
too, with other data. We'd like to allow searching across the two sites, so
that a single query matches data on both clusters and merges the results.

One constraint is that data must not be shared between the two clusters,
which is why we don't use a single cluster.

Is such a scenario possible?

Thanks for your answers.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Well, you could research the problem, then see if ES is a candidate, and
then ask for design and implementation details. But you skipped the hard
parts, so ...

(In reply to Raphaël Flores, Saturday, August 24, 2013 9:36:11 AM UTC-4.)

I think the Elasticsearch way of doing this is to have a single
cluster spanning both datacenters,
then use routing to limit where each piece of data is actually indexed:
http://www.elasticsearch.org/blog/customizing-your-document-routing/
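
As a rough illustration of the routing idea, the sketch below only builds request URLs carrying the `routing` query parameter, which Elasticsearch accepts on both index and search requests. The host, index name, document id and routing value are made up for the example, and the `_doc` path segment assumes a recent Elasticsearch version (older releases use a type name there). Note that a routing value selects a shard; which node holds that shard is decided by shard allocation, not by routing.

```python
from urllib.parse import quote, urlencode


def index_url(base_url, index, doc_id, routing):
    """URL for indexing one document on the shard selected by `routing`."""
    query = urlencode({"routing": routing})
    return f"{base_url}/{quote(index)}/_doc/{quote(str(doc_id))}?{query}"


def search_url(base_url, index, routing):
    """URL for a search that only visits the shard(s) selected by `routing`."""
    query = urlencode({"routing": routing})
    return f"{base_url}/{quote(index)}/_search?{query}"
```

For example, `search_url("http://localhost:9200", "docs", "siteA")` yields a search URL restricted to the shard that `siteA` hashes to.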

If you want 2 separate clusters, I think you would have to query both
separately and merge the results yourself on the client side.
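
A minimal sketch of such a client-side merge, assuming each cluster returns its hits already sorted by descending `_score` (the default ordering of a search response). The function names are invented for the example. Scores computed by independent clusters rest on different term statistics, so ranking by raw `_score` across clusters is only an approximation.

```python
import heapq
import json
import urllib.request
from itertools import islice


def fetch_hits(base_url, index, query_body):
    """Run one search against one cluster and return its list of hits."""
    request = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(query_body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["hits"]["hits"]


def merge_hits(hit_lists, size=10):
    """Merge per-cluster hit lists (each sorted by descending _score)
    into a single list holding the overall top `size` hits."""
    merged = heapq.merge(*hit_lists, key=lambda hit: hit["_score"], reverse=True)
    return list(islice(merged, size))
```

A caller would invoke `fetch_hits` once per cluster (for instance against hypothetical addresses such as `http://s1.example.org:9200` and `http://s2.example.org:9200`) and pass the resulting lists to `merge_hits`.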

(In reply to BillyEm, 2013/8/26.)

Thanks for the clarification, Tlarhices; it confirms what I suspected.

For such a use case (search across multi-organization indexes), Solr might
be a better solution, as it allows delegating part of a search to specified
shards.

Anyway, I'll keep ES in mind for my own organization only.

Thanks.

(In reply to Tlarhices, Monday, August 26, 2013 06:06:45 UTC+2.)

Would shard allocation/awareness settings work? You could configure a
separate index for each organization so no data is shared. Then you can
limit each index to specific nodes in the cluster so no nodes share indices
from multiple organizations (each org has their own cluster of nodes)

http://www.elasticsearch.org/guide/reference/modules/cluster/
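
A small sketch of the per-organization setup described above, assuming each node is tagged in its configuration with a custom attribute named `org` (the attribute name, index names and organization names are invented for the example). It only builds the index settings bodies; `index.routing.allocation.require.<attribute>` is the index-level shard allocation filtering setting, and applying it would be a `PUT /<index>/_settings` request.

```python
def allocation_settings(attribute, value):
    """Settings body restricting an index's shards to nodes
    whose custom `attribute` equals `value`."""
    return {f"index.routing.allocation.require.{attribute}": value}


def per_org_settings(org_indices, attribute="org"):
    """Map each index name to the settings that keep it
    on the nodes of its own organization."""
    return {
        index: allocation_settings(attribute, org)
        for index, org in org_indices.items()
    }
```

For instance, `per_org_settings({"index_orga": "orgA", "index_orgb": "orgB"})` gives one settings body per index, each pinning the index to its organization's nodes.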

Thanks,
Matt Weber

(In reply to Raphaël Flores, Monday, August 26, 2013.)

Hey, that sounds really interesting! I had missed this feature, and I still
need to give it a try.

Next I'll have to check whether an index's settings can be updated through
the admin REST API when the index is located on a remote node (meaning we
can't access the ES instance on that node directly); in other words, whether
I can update such a remotely stored index using only the local ES node.

Thanks a lot Matt!

(In reply to Matt Weber, Monday, August 26, 2013 14:30:42 UTC+2.)

"meaning we can't access the ES instance on this node directly"... I
imagine this means that the HTTP/REST API (port 9200) is disabled on this
node but the node can still communicate with the rest of the cluster via
the transport address (i.e. port 9300)? As long as the nodes can talk to
each other, you will be fine. ES routes all operations to the correct nodes
no matter where the request originated.
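
To illustrate the point about request routing, here is a hedged sketch of an index settings update addressed to whichever node exposes HTTP locally; the cluster forwards it regardless of where the index's shards live. The node address, index name and setting value are assumptions for the example, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_settings_request(base_url, index, settings):
    """Build (without sending) a PUT /<index>/_settings request
    addressed to the node reachable at `base_url`."""
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=json.dumps(settings).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the result of, say, `build_settings_request("http://localhost:9200", "index1", {"index": {"number_of_replicas": 1}})` to `urllib.request.urlopen` would send it through the local node.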

Thanks,
Matt Weber

(In reply to Raphaël Flores, Mon, Aug 26, 2013 at 7:57 AM.)

Routing all operations to the correct node is useful for queries, and that
is fine. What I'm looking for is a way to prevent any update of index-level
settings from nodes that do not store those indexes.

I'm not sure I explained my situation clearly. Here is an example:
Datacenter A (dcA, under orgA's control) has 2 nodes storing an index,
index1DA.
Datacenter B (dcB, under orgB's control) has 2 nodes storing an index,
index2DB.
We would like to prevent any node in Datacenter B from changing the settings
of index1DA, which is stored on Datacenter A's nodes, so that the following
request does not affect the index's configuration:
http://node1.dcB.net:9200/index1DA/_close

The problem is that dcB and dcA are not controlled by the same
organization. We'd like to allow querying across both datacenters, but the
organizations holding the data may not want their data/indexes replicated
elsewhere, and want to prevent any change to index settings from nodes under
the control of other organizations.

I know this use case is unusual, and it could be called a cluster
federation rather than a real cluster, since some boundaries have to be
clearly set inside this ES cluster.

I hope this is clearer. Thanks Matt.

Raphaël.

(In reply to Matt Weber, Monday, August 26, 2013 17:13:29 UTC+2.)