Best way to duplicate data across clusters live?


(Josh Harrison) #1

Say I have clusters A and B. Cluster A is consuming data using an ActiveMQ
river. I would like to stream data to cluster B as well. Do I just create a
secondary outbound AMQ channel and subscribe cluster B to it, or is there a
decent way to have a live copy of data going two places at once?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e58efbd5-0cc0-436d-8a41-5f7987587881%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Mark Walkom) #2

It'd be simpler to do it on the AMQ end and keep your clusters separate.
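
To make the AMQ-side suggestion concrete: an ActiveMQ queue hands each message to a single consumer, while a topic delivers a copy to every subscriber, so the duplication needs topic-style fan-out (virtual topics or composite destinations are the usual broker-side tools). The toy model below is not ActiveMQ code. `FanOutBroker` and the subscriber names are hypothetical, and it only illustrates the "one publish, one private copy per cluster" behaviour.

```python
from collections import deque


class FanOutBroker:
    """Toy in-memory stand-in for a topic: every subscriber gets its own copy.

    Hypothetical helper for illustration only; this is not an ActiveMQ API.
    """

    def __init__(self):
        self.queues = {}

    def subscribe(self, name):
        # One private queue per subscriber, so cluster B never takes
        # messages away from cluster A (and vice versa).
        self.queues[name] = deque()
        return self.queues[name]

    def publish(self, message):
        # Deliver an independent copy of the message to each subscriber.
        for queue in self.queues.values():
            queue.append(dict(message))


broker = FanOutBroker()
cluster_a = broker.subscribe("cluster_a")
cluster_b = broker.subscribe("cluster_b")

broker.publish({"id": 1, "body": "first event"})
broker.publish({"id": 2, "body": "second event"})

ids_a = [m["id"] for m in cluster_a]
ids_b = [m["id"] for m in cluster_b]
print(ids_a, ids_b)
```

Because each cluster drains its own queue, either one can fall behind or go offline without affecting what the other receives.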

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 13 March 2014 05:55, Josh Harrison hijakk@gmail.com wrote:

Say I have clusters A and B. Cluster A is consuming data using an ActiveMQ
river. I would like to stream data to cluster B as well. Do I just create a
secondary outbound AMQ channel and subscribe cluster B to it, or is there a
decent way to have a live copy of data going two places at once?



(Otis Gospodnetić) #3

Consider Kafka 0.8.1. It comes with a MirrorMaker tool that mirrors Kafka
data (to multiple DCs). Once data is local, you can feed your ES from the
local Kafka broker.
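
The mirror-then-consume-locally flow can be pictured with a small in-memory model. This is only a sketch of the idea, not MirrorMaker or any Kafka client API: the logs are plain lists, the index is a dict, and `mirror` and `index_from` are hypothetical helpers.

```python
def mirror(source_log, target_log, offset):
    """Copy records the mirror has not yet seen; return the new offset."""
    new_records = source_log[offset:]
    target_log.extend(new_records)
    return offset + len(new_records)


def index_from(log, index, offset):
    """Feed records from the local log into a dict standing in for an index."""
    for record in log[offset:]:
        index[record["id"]] = record["body"]
    return len(log)


remote_log = [{"id": 1, "body": "a"}, {"id": 2, "body": "b"}]
local_log = []    # the mirrored copy in the second data centre
local_index = {}  # stand-in for the local Elasticsearch cluster

mirror_offset = mirror(remote_log, local_log, 0)
remote_log.append({"id": 3, "body": "c"})
mirror_offset = mirror(remote_log, local_log, mirror_offset)

index_offset = index_from(local_log, local_index, 0)
print(mirror_offset, index_offset, local_index)
```

Tracking an offset per consumer is what lets the mirror and the indexer each resume from where they left off.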

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Wednesday, March 12, 2014 2:55:58 PM UTC-4, Josh Harrison wrote:

Say I have clusters A and B. Cluster A is consuming data using an ActiveMQ
river. I would like to stream data to cluster B as well. Do I just create a
secondary outbound AMQ channel and subscribe cluster B to it, or is there a
decent way to have a live copy of data going two places at once?



(Josh Harrison) #4

Kafka looks interesting, though at this point we're actively trying to
reduce the number of moving parts, so I think an AMQ-based approach is what
we'll ultimately go for.
Seems like there might be room here for an
elasticsearch-elasticsearch-river plugin or something similar, to do one-way
or two-way near-real-time replication of a selected set of indexes between
separate clusters. That way you could easily mirror prod data to a dev
environment without depending on the ability to do the duplication earlier
in the pipeline, or on scripts to move the data around.
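
As a rough picture of what one-way replication of selected indexes might involve, here is a toy sketch. No such plugin is assumed to exist: clusters are modelled as nested dicts of `index -> doc_id -> (version, body)`, `replicate` is a hypothetical helper, and deletions are deliberately not handled.

```python
def replicate(source, target, selected):
    """One-way copy of new or newer documents for the selected indexes only.

    Returns the number of documents copied. Deletions are not propagated.
    """
    copied = 0
    for name in selected:
        src_docs = source.get(name, {})
        dst_docs = target.setdefault(name, {})
        for doc_id, (version, body) in src_docs.items():
            current = dst_docs.get(doc_id)
            # Copy only when the target lacks the doc or holds an older version.
            if current is None or current[0] < version:
                dst_docs[doc_id] = (version, body)
                copied += 1
    return copied


prod = {
    "logs": {"1": (1, "first"), "2": (2, "second, edited")},
    "users": {"u1": (1, "alice")},
}
dev = {"logs": {"2": (1, "second")}}

first_pass = replicate(prod, dev, ["logs"])
second_pass = replicate(prod, dev, ["logs"])
print(first_pass, second_pass, sorted(dev))
```

Running the pass repeatedly is safe because a second pass with no new changes copies nothing, and indexes outside the selected set never reach the dev side.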

On Wednesday, March 12, 2014 2:46:19 PM UTC-7, Otis Gospodnetic wrote:

Consider Kafka 0.8.1. It comes with a MirrorMaker tool that mirrors Kafka
data (to multiple DCs). Once data is local, you can feed your ES from the
local Kafka broker.



