Hi all,
I'm investigating setting up an Elasticsearch cluster that spans multiple
regions (possibly EC2 regions, but possibly not), and I'm anticipating a
fair bit of latency between them.
I think it makes sense to use
cluster.routing.allocation.awareness.attributes and
cluster.routing.allocation.awareness.force.region.values, and to set the
number of replicas to (number of regions - 1). That way there will be a full
copy of the data in each region (right?)
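To make that concrete, here is a rough Python sketch of the settings I have in mind. The helper function, the attribute name "region" and the region names are made up for illustration. Each node would also need its own region attribute set in its config, which is not shown here.

```python
def awareness_settings(regions, attribute="region"):
    """Build the cluster and index settings for one full copy per region.

    `regions` and `attribute` are illustrative assumptions, not real values.
    """
    if not regions:
        raise ValueError("need at least one region")
    prefix = "cluster.routing.allocation.awareness"
    cluster = {
        # attribute the allocator should balance shard copies across
        prefix + ".attributes": attribute,
        # forced awareness: list every region the cluster expects to have
        prefix + ".force." + attribute + ".values": ",".join(regions),
    }
    # one primary plus (regions - 1) replicas gives one copy per region
    index = {"index.number_of_replicas": len(regions) - 1}
    return cluster, index


# hypothetical region names
cluster, index = awareness_settings(["us-east", "eu-west", "ap-south"])
print(cluster)
print(index)
```

With three regions this gives number_of_replicas = 2, so three copies of every shard in total.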
I'm also configuring the client nodes with
cluster.routing.allocation.awareness.attributes so they should always hit
nodes in the same region if they are available.
This is awesome because it means full failover if a region goes down, and
also means that searches won't have to leave the region they started from,
avoiding the latency (unless nodes in that region go down, but that's OK).
The only issue is indexing. My understanding (and correct me if I'm wrong)
is that a document must first be indexed on the primary shard and is then
sent to the replicas. So if the primary is not in your region, it will take
at least 2*latency before you see the document in your region.
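As a back-of-the-envelope sketch of that estimate (assuming "latency" means the one-way delay between regions, and ignoring processing time and the refresh interval), with a made-up figure of 80 ms:

```python
def min_visibility_delay_ms(one_way_latency_ms, primary_is_local):
    """Lower bound on inter-region delay before a new doc reaches the local copy."""
    if primary_is_local:
        # the primary is in our own region, so no inter-region hop is needed
        return 0.0
    # hop 1: local node -> remote primary
    # hop 2: remote primary -> replica in our region
    return 2.0 * one_way_latency_ms


# 80 ms is a hypothetical one-way latency, not a measurement
print(min_visibility_delay_ms(80, primary_is_local=False))  # 160.0
print(min_visibility_delay_ms(80, primary_is_local=True))   # 0.0
```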
So is there a way to make sure that each region always has at least one
primary shard? And to route new documents to that particular shard? I
thought
http://www.elasticsearch.org/guide/reference/api/admin-cluster-reroute.html
might help, but as far as I can tell I can't use it to switch a shard to
primary status.
Or perhaps my whole approach is wrong. Does anyone have strategies for
dealing with high latency between sections of a cluster?
Thanks in advance,
Dustin