Dealing with latency when indexing


(Dustin Lashmar) #1

Hi all,

I'm investigating setting up an Elasticsearch cluster that spans multiple
regions (possibly EC2 regions, but possibly not), and I'm anticipating a
fair bit of latency between them.
I think it makes sense to use
cluster.routing.allocation.awareness.attributes and
cluster.routing.allocation.awareness.force.region.values, and to set the
number of replicas = number of regions - 1. That way there will be a full
copy of the data in each region (right?).
I'm also configuring the client nodes with
cluster.routing.allocation.awareness.attributes so they should always hit
nodes in the same region if they are available.
This is awesome because it means full fail-over if a region goes down, and
it also means that searches won't have to leave the region they started from,
avoiding the latency (unless nodes in that region go down, but that's OK).
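
To make the setup above concrete, here is a minimal sketch (not a tested configuration) that derives the replica count from a list of regions and assembles the two cluster-level awareness settings named in this post. The region names, the `region` attribute name, and the `awareness_settings` helper are assumptions for illustration only; how the attribute is declared on each node depends on the Elasticsearch version and is not shown.

```python
def awareness_settings(regions, attribute="region"):
    """Build the awareness settings and replica count for the given regions.

    `regions` and `attribute` are illustrative inputs, not values from a
    real cluster.
    """
    if not regions:
        raise ValueError("at least one region is required")
    cluster = {
        # Tell the allocator which node attribute marks a region.
        "cluster.routing.allocation.awareness.attributes": attribute,
        # Forced awareness: list every region that is expected to exist.
        f"cluster.routing.allocation.awareness.force.{attribute}.values": ",".join(regions),
    }
    # One primary plus (regions - 1) replicas gives one copy per region.
    index = {"index.number_of_replicas": len(regions) - 1}
    return cluster, index


cluster_settings, index_settings = awareness_settings(
    ["us-east", "eu-west", "ap-southeast"]  # hypothetical region names
)
```

With three regions this yields two replicas, so each shard has three copies in total and awareness can place one in each region.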

The only issue is indexing. My understanding (and correct me if I'm wrong)
is that a document must first be indexed on the primary shard and is only
then forwarded to the replicas. So if the primary is not in your region, it
will take at least 2*latency before you see the document in your region.
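
The 2*latency estimate can be written out as a small worked example. This is a simplified model under stated assumptions: one-way latency between regions is symmetric, processing time is ignored, and the function name and the 80 ms figure are made up for illustration.

```python
def min_visibility_delay_ms(client_region, primary_region, one_way_ms):
    """Lower bound on the delay before a document reaches a shard copy
    in the client's own region (network hops only)."""
    if client_region == primary_region:
        # The primary is local, so no cross-region hop is needed.
        return 0
    # Hop 1: request travels from the client's region to the remote primary.
    # Hop 2: the primary forwards the document to the replica back home.
    return 2 * one_way_ms


remote_delay = min_visibility_delay_ms("eu-west", "us-east", 80)
local_delay = min_visibility_delay_ms("us-east", "us-east", 80)
```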

So is there a way to make sure that each region always has at least one
primary shard? And to route new documents to that particular shard? I
thought
http://www.elasticsearch.org/guide/reference/api/admin-cluster-reroute.html
might help, but as far as I can tell I can't use it to switch a shard to
primary status.

Or perhaps my whole approach is wrong. Does anyone have strategies for
dealing with high latency between sections of a cluster?

Thanks in advance,

Dustin

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Neil Andrassy) #2

Hi Dustin,

Did you get very far on this or find any useful info elsewhere? We're
building out a similar multi-site cluster.

Thanks,

Neil

On Thursday, 14 March 2013 02:43:49 UTC, Dustin Lashmar wrote:



(Dustin Lashmar-2) #3

Hi Neil,

I didn't get much further with this, as we found a single region was enough
for now. From what I've read, though, most people recommend against a
single cluster that spans multiple regions and suggest setting up a cluster
per region and using other techniques to keep them in sync. How exactly
you'd go about that I don't really know; I never got that far with it.

Sorry I couldn't help more. Let me know if you find a nice solution.

Dustin

On Friday, August 30, 2013 7:19:28 PM UTC+10, Neil Andrassy wrote:



(Brian Yoder) #4

I suspect that the immediate answer is to implement two independent
clusters, one in each region.

Then create a durable queue in each cluster. Make sure the backing file
store for that queue is properly replicated (for example, iSCSI or NetApp
or something similar).

Then each consumer is a Java application that consumes a request from the
queue and sends it to the other cluster via the TransportClient class.
Defining the TransportClient with all of the addresses of the nodes in the
remote cluster will handle the needed failover.

Then ensure that the application can tolerate the inter-cluster latency.
For example, if the link between regions is broken, the clusters can
operate independently and queue updates for each other. The application must
be aware of this and treat it as a normal part of its operation. This is
easier said than done, but for a database with a high update rate that
is split across two faraway regions, this is the simplest option, which
means it's the easiest to build and test.
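
The queue-and-forward idea above can be sketched in a few lines. This is only an illustration under assumptions: the post describes a Java consumer using TransportClient and a durable, replicated queue, whereas here an in-memory deque stands in for the queue and `send_to_remote` is a hypothetical callable that raises `ConnectionError` when the other region is unreachable.

```python
from collections import deque


class RegionForwarder:
    """Buffers local index requests and replays them, in order, to the
    cluster in the other region."""

    def __init__(self, send_to_remote):
        # Hypothetical callable: send_to_remote(request) raises
        # ConnectionError when the remote cluster cannot be reached.
        self._send = send_to_remote
        # In-memory stand-in for the durable queue described above.
        self._pending = deque()

    def enqueue(self, request):
        self._pending.append(request)

    def backlog(self):
        return len(self._pending)

    def drain(self):
        """Forward queued requests in order and stop at the first failure.

        Returns the number of requests delivered in this call.
        """
        delivered = 0
        while self._pending:
            request = self._pending[0]  # peek without removing
            try:
                self._send(request)
            except ConnectionError:
                break  # link is down: keep the request for a later retry
            self._pending.popleft()  # remove only after a successful send
            delivered += 1
        return delivered
```

Because a request is removed only after it has been sent, a crash between the send and the removal would replay it later. Indexing with explicit document IDs keeps such replays harmless, since re-indexing the same ID overwrites the earlier copy.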

On the other hand, putting the database into one highly available cloud
(for example, Amazon) is a single-region solution that follows Mark Twain's
advice to "put all of your eggs in one basket, and then watch that
basket!". This works very well most of the time, but network or other
service outages, such as the recent Amazon AWS outage, must be factored in.

On Sunday, September 1, 2013 7:24:40 PM UTC-4, Dustin Lashmar wrote:



(Jörg Prante) #5

A simple mirror cluster indexing technique is firing up two (or more)
TransportClients and indexing to the clusters in sync.
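
A minimal sketch of this mirror-indexing idea, assuming each cluster is represented by a hypothetical `index_fn(doc_id, document)` callable (in the setup described here, that role would be played by one TransportClient per cluster). The function and cluster names are made up for illustration.

```python
def index_to_all(clusters, doc_id, document):
    """Send one document to every cluster.

    `clusters` maps a cluster name to a hypothetical callable
    index_fn(doc_id, document). Returns the names of clusters whose
    write failed, so the caller can retry or queue them.
    """
    failed = []
    for name, index_fn in clusters.items():
        try:
            index_fn(doc_id, document)
        except Exception:
            # Broad catch for the sketch: any error counts as a failed write.
            failed.append(name)
    return failed


# Usage with two in-memory dicts standing in for real clusters.
store_a, store_b = {}, {}
failed = index_to_all(
    {"region-a": store_a.__setitem__, "region-b": store_b.__setitem__},
    "doc-1",
    {"title": "hello"},
)
```

Note that a write to the remote cluster still pays the cross-region latency, and the caller has to decide what to do when only some clusters accept the document.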

Jörg


