Cross DC Cluster and Data Replication

Hi All

We are in the process of establishing Elasticsearch at enterprise scale, spanning servers in two primary data centers located in the US and Europe. We are using version 5.5.x, and our goal is to have a single view of the enterprise data spread across both DCs.

I am trying to ascertain the right approach to go about this.

Is a single cluster spanning both DCs feasible?
If not, what kind of replication scheme should be used to replicate data between the clusters located in each DC?
Do you recommend using Kafka as the queuing mechanism, or Logstash's own persistent queues instead?

Senthil Nathan M


It depends on what you really need here.

Again, depends on your needs. What is the replication actually hoping to achieve?

We are working on functionality to allow native cross cluster replication. It should be available in the near future.

Thanks for the prompt response, Mark.

Our primary objective is to have an Elasticsearch environment that aggregates log files from all servers in both data centers (US and EU).
We are looking for a unified view of all the collected log messages (via Kibana or a similar interface).
Search and indexing latency must be minimal.

I looked at this page and felt that the "Independent Elasticsearch and Kafka Clusters" option suits our requirements best.

For our requirements, would a dedicated Elasticsearch cluster and Kafka instance in each data center be appropriate? If so, can I use a simple Logstash service to replicate data from the remote DC?
That page is over 2 years old, and since Elasticsearch has added more features since then, I wanted a fresh perspective on the architecture discussed there. With the features available in 5.5.x, is that architecture still viable?

Why not use 6.X and cross cluster search?
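
For anyone landing here later, a rough sketch of what cross cluster search involves: the cluster that serves Kibana registers the other cluster under an alias, and searches then address remote indices as `alias:index-pattern`. The alias `us_dc`, the host `es-us-1.example.com` and the `logstash-*` pattern below are made-up placeholders. The setting prefix is passed as a parameter because, as far as I know, it is `search.remote` in earlier releases and `cluster.remote` from 6.5 onwards, so check the documentation for your version. The code only builds the request bodies and does not talk to a cluster.

```python
import json


def remote_cluster_settings(remotes, setting_prefix="search.remote"):
    """Build a cluster-settings body that registers remote clusters by alias.

    `remotes` maps an alias to a list of seed nodes ("host:transport_port").
    `setting_prefix` is an assumption; verify it against your version's docs.
    """
    persistent = {}
    for alias, seeds in remotes.items():
        persistent[f"{setting_prefix}.{alias}.seeds"] = list(seeds)
    return {"persistent": persistent}


def cross_cluster_index_expression(pattern, remote_aliases, include_local=True):
    """Return the index expression that searches local and remote indices."""
    parts = [pattern] if include_local else []
    parts += [f"{alias}:{pattern}" for alias in remote_aliases]
    return ",".join(parts)


# Hypothetical values: the EU cluster registers the US cluster as "us_dc".
remotes = {"us_dc": ["es-us-1.example.com:9300"]}

settings_body = remote_cluster_settings(remotes)
search_target = cross_cluster_index_expression("logstash-*", remotes)

# Body to send with PUT _cluster/settings on the cluster that runs the searches.
print(json.dumps(settings_body, indent=2))
# Search path covering the local indices and the remote ones.
print(f"GET /{search_target}/_search")
```

The point of this layout is that each DC keeps indexing locally, and only search requests cross the WAN link.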

Thanks Mark. I'll look into the feasibility of us upgrading to 6.x.

Meanwhile, if there is something we can accomplish with 5.5.x, kindly let me know.

Appreciate your prompt response.

Senthil Nathan M

Hi Mark

Our planned setup works like this.

America DC:

(1) Client Nodes
(2) Logstash Node (the logs generated by all clients in this DC are processed by this Logstash node)

Europe DC:

(1) Client Nodes
(2) Logstash Node (the logs generated by all clients in this DC are processed by this Logstash node)
(3) Elasticsearch Server
(4) Elasticsearch Database
(5) Kibana

The data collected by Logstash in the America DC will be transferred to the Elasticsearch Server in the Europe DC.

Also, the Elasticsearch data volume is expected to grow rapidly this year.

Do you foresee any problems with this architecture that would affect Elasticsearch performance, or is it sustainable?

Senthil Nathan M

That looks ok.

Thank you, Mark, for the prompt response.

I am happy to close this thread as I got the answer I was looking for.

Senthil Nathan M
