Best practice for syncing an index across multiple data centers

We have ES 2.x clusters deployed in 4 different Azure data centers.

To ingest data into the ES clusters in every data center, we first push the index data to a centralized data source (either a DB or Azure Event Hub). A microservice in each data center then ingests the data from that source into its local ES cluster.

The problem is that if either the microservice or the ES cluster in any one data center crashes for some reason, the index might get out of sync between data centers.

Is there a better way to make sure the index stays in sync across data centers? Any thoughts would be helpful, not just solutions limited to Azure.

Using a message queue to store events and then indexing them separately into each cluster is currently the recommended best practice, and that seems to be what you are already doing. Unfortunately, I have no experience with Azure Event Hub, so I don't know how best to use it in this kind of setup. This blog post, however, contains a discussion of using Kafka to achieve this, which may be useful.
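
To make the idea a bit more concrete, here is a rough sketch of how a per-data-center indexer could catch up after a crash. It is only an illustration under assumptions: `EventLog` is an in-memory stand-in for a replayable queue (Kafka, or possibly Event Hub), `ClusterIndexer` is a stand-in for one microservice plus its local cluster, and all names are made up for the example. The two points it tries to show are that each data center tracks its own offset in the log, and that applying an event is idempotent because an older version never overwrites a newer one. With a real cluster that version check could probably be delegated to Elasticsearch's external versioning (`version_type=external`) rather than done in the service.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Event:
    """One index operation. All field names here are hypothetical."""
    doc_id: str    # deterministic document ID, identical in every data center
    version: int   # increasing version assigned by the producer
    body: dict


class EventLog:
    """In-memory stand-in for a durable, replayable queue."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def read_from(self, offset: int) -> List[Event]:
        # Return every event at or after the given offset.
        return self._events[offset:]


@dataclass
class ClusterIndexer:
    """Stand-in for one data center's microservice and its local cluster."""
    name: str
    docs: Dict[str, Tuple[int, dict]] = field(default_factory=dict)
    offset: int = 0  # position of the next unread event in the log

    def apply(self, event: Event) -> None:
        # Idempotent write: an older or equal version never replaces a newer one.
        current = self.docs.get(event.doc_id)
        if current is None or event.version > current[0]:
            self.docs[event.doc_id] = (event.version, event.body)

    def catch_up(self, log: EventLog) -> int:
        # Apply everything since the last recorded offset; return how many events were read.
        pending = log.read_from(self.offset)
        for event in pending:
            self.apply(event)
            self.offset += 1  # advance only after the event has been applied
        return len(pending)


def demo() -> Tuple[ClusterIndexer, ClusterIndexer]:
    log = EventLog()
    east, west = ClusterIndexer("east"), ClusterIndexer("west")

    log.append(Event("a", 1, {"title": "first"}))
    east.catch_up(log)
    west.catch_up(log)

    # "west" is down while two more events arrive; only "east" indexes them.
    log.append(Event("a", 2, {"title": "second"}))
    log.append(Event("b", 1, {"title": "other"}))
    east.catch_up(log)

    # "west" comes back and replays from its own offset.
    west.catch_up(log)
    return east, west
```

Because the writes are idempotent, a recovering data center can also rewind its offset and replay more of the log than strictly necessary without ending up with stale documents, as long as the queue retains events long enough to cover the outage.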
