Elasticsearch in different zone

amanjain · April 30, 2020, 12:04pm

Hello,

I'd like to understand how Elasticsearch performs when its data is spread across different zones within the same region.
I tried installing it across different zones, but the performance (latency) was very bad:
Data shards in same zone: 100ms
Data shards in different zones: 10 seconds
I'd also like your opinion on whether the ES master and ES client nodes can be in a different zone than the ES data nodes.

Cloud provider: Google Cloud Platform

Thanks

Emanuil · April 30, 2020, 3:31pm

Hi! Welcome to the community :).

I can't really comment on specific zones in a specific region for a specific cloud infra provider, but the problem with your setup as-is is that the point of availability zones is ... guaranteed availability per zone. If one zone goes down, your cluster is left scrambling to rebalance itself, because half or a third of it is instantly taken out of action (depending on whether you spread it over 2 or 3 AZs). This is the basic assumption behind AZs - that one can and will go down at some point, and we want to continue operating when that happens. If the goal is higher redundancy for some critical data, I would run a separate cluster in each zone and use Cross Cluster Replication.

To be honest, if you really need the data replicated across zones, I'd just use Elasticsearch Service on Elastic Cloud, which supports Google Cloud as the underlying provider. It lets you set the number of availability zones by just moving a slider, and it takes care of replicating the data.

CCR and Elasticsearch Service are both paid. If I needed to do what you're trying on a budget, I'd first reconsider if I really need guaranteed availability for this data, or is it OK to just make regular snapshots and restore the cluster in another AZ manually or automatically if it becomes unavailable. You could also theoretically spin up a cluster in each AZ without Cross Cluster Replication, then make the app(s) write to all of them. There are big potential data consistency (& loss) problems with this approach and you will end up essentially trying to reimplement what Elasticsearch Service's orchestration layer does for you. It's generally a hard problem.
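To make the consistency problem with writing to every cluster concrete, here is a minimal Python sketch. Everything in it is hypothetical: `write_to_all`, the zone names, and the in-memory dicts that stand in for real Elasticsearch clusters are illustrations, not part of any Elasticsearch API. It only shows how a single failed write leaves the clusters disagreeing.

```python
def write_to_all(clusters, doc_id, doc):
    """Try to index `doc` in every cluster.

    `clusters` maps a cluster name to a callable(doc_id, doc) that performs
    the write (hypothetical stand-in for a real client call).
    Returns the names of the clusters where the write failed.
    """
    failed = []
    for name, index_fn in clusters.items():
        try:
            index_fn(doc_id, doc)
        except Exception:
            # The other clusters may already hold the document, so the
            # copies have now diverged and need to be reconciled somehow.
            failed.append(name)
    return failed


# In-memory stand-ins for two clusters, one per zone.
zone_a = {}
zone_b = {}


def write_zone_a(doc_id, doc):
    zone_a[doc_id] = doc


def write_zone_b(doc_id, doc):
    # Simulate zone B being unreachable at write time.
    raise ConnectionError("zone-b unreachable")


failed = write_to_all(
    {"zone-a": write_zone_a, "zone-b": write_zone_b},
    "1",
    {"msg": "hello"},
)
```

After this run, `failed` names `zone-b`, and `zone_a` holds the document while `zone_b` does not. The application now has to decide whether to retry, queue the write, or roll back the copy in zone A, which is the part that gets hard.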

amanjain · April 30, 2020, 5:24pm

Thank you so much @Emanuil. I am deploying 2-3 clusters and writing replicated data to each of them.

I want to evaluate whether placing the master node in a different zone than the data nodes will affect latency. Can you guide me on how to evaluate this, and on what would cause the increase in latency?

Emanuil · May 1, 2020, 9:39am

Just a very simple and quick experiment first - what's the ping time between virtual machines (or GKE containers, whatever you're using) in these different zones? If the ping is unusually high, there's not much you can do in Elasticsearch.
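If ICMP ping is blocked between the machines, a rough substitute is to time TCP connects from one node to another. The Python sketch below is one way to do that, under stated assumptions: the function names are made up for illustration, and the address in the usage comment is a placeholder for a node in the other zone. A completed TCP handshake costs roughly one network round trip, so the numbers should be in the same ballpark as ping.

```python
import socket
import statistics
import time


def tcp_connect_times_ms(host, port, samples=5, timeout=2.0):
    """Return the time in milliseconds taken by each of `samples` TCP connects."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the TCP handshake has completed.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
        timings.append(elapsed * 1000.0)
    return timings


def summarize(timings_ms):
    """Reduce a list of timings to min / median / max."""
    return {
        "min_ms": min(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
    }


# Example usage (placeholder address; 9300 is the default Elasticsearch transport port):
# print(summarize(tcp_connect_times_ms("10.0.1.5", 9300)))
```

Run it once against a node in the same zone and once against a node in the other zone, then compare the medians. If the cross-zone figure is only a few milliseconds higher, the network alone is unlikely to explain a jump from 100ms to 10 seconds.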

system · May 29, 2020, 9:39am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
