Scaling elasticsearch queries


(Milad Fatenejad) #1

Hello:

I had a question related to handling increased query demand. Let's say I
have an index which stores data for a single month which is queried at some
rate. Then a few months later, due to some kind of external event, there is
a substantially increased interest in searching this data. How do I scale
up the index to handle the larger number of queries? At this point I am not
indexing any new documents, just searching. I know that I can add new nodes
and shards will be moved there, but this stops working once you have 1
shard per node.

At this point, what do I do? Do I increase the number of replicas? If so,
is it easy to later (after the demand drops) decrease the number of
replicas?

Thank You
Milad

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(David Pilato) #2

Yes, exactly. You may also need to provide more machines to hold all your replicas.
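
As a rough illustration of changing the replica count up and then back down: `number_of_replicas` is a dynamic index setting, so it can be updated on a live index through the index settings endpoint. The sketch below only builds the HTTP requests with the Python standard library. The host `http://localhost:9200`, the index name `logs-2013-10`, and the helper `replica_update_request` are assumptions made for this example, not anything from the thread.

```python
import json
from urllib.request import Request


def replica_update_request(base_url: str, index: str, replicas: int) -> Request:
    """Build (but do not send) a PUT to <base_url>/<index>/_settings."""
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and index name, for illustration only.
scale_up = replica_update_request("http://localhost:9200", "logs-2013-10", 3)
scale_down = replica_update_request("http://localhost:9200", "logs-2013-10", 1)

# To actually apply a change, pass the request to urllib.request.urlopen(...).
```

Raising the value makes the cluster copy shards to the extra nodes; lowering it later simply drops the surplus copies.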

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

In reply to Milad Fatenejad icksa1@gmail.com (24 Oct 2013, 19:51)



(Milad Fatenejad) #3

Thank you Dave, one quick follow-up question. We have two data centers in
our cloud and I would like to have the primary shard and replica split
across these data centers. I read the following page:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#allocation-awareness

and I believe the location awareness settings will accomplish this, for
example I can set:

cluster.routing.allocation.awareness.attributes: datacenter

If I add a replica (so my replica count is 2), what will happen with the
location awareness? The documentation seems to only say "that a shard and
its replica won't share the same" datacenter, so what happens if there are
only 2 data centers, but 3 copies of the data (i.e. replica count = 2)?
Will all replicas move to one datacenter? Or will elasticsearch try to
spread things around as much as possible in some way?

Thanks again!
Milad
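
For reference, the awareness setting quoted in this message can also be applied at runtime through the cluster settings endpoint rather than the configuration file. The sketch below only builds that request. The host `http://localhost:9200` and the helper name `awareness_request` are assumptions for illustration; each node must also be tagged with a `datacenter` attribute in its own configuration, and the exact key for that depends on the Elasticsearch version.

```python
import json
from urllib.request import Request


def awareness_request(base_url: str, attribute: str) -> Request:
    """Build (but do not send) a PUT to <base_url>/_cluster/settings."""
    settings = {
        "persistent": {
            # Same key as the configuration line quoted in the message.
            "cluster.routing.allocation.awareness.attributes": attribute
        }
    }
    return Request(
        url=f"{base_url.rstrip('/')}/_cluster/settings",
        data=json.dumps(settings).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host; "datacenter" is the attribute name used in the message.
req = awareness_request("http://localhost:9200", "datacenter")
```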

In reply to David Pilato david@pilato.fr (Thu, Oct 24, 2013, 3:44 PM)



(David Pilato) #4

Ha! Good question. I'm not sure, but I guess the extra replica won't be allocated anywhere.

That said, unless you have a very low-latency network between the two datacenters, I would not recommend that architecture.
I would create two clusters and have the client application push documents to both clusters.

If network connectivity is good enough and guaranteed, then you can probably go for it, but keep in mind that you will need to think carefully about split-brain issues when the datacenters are disconnected.
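
A minimal sketch of the two-cluster idea, assuming a client that writes each document to both clusters over HTTP. It only builds the requests. The cluster URLs, the index name, and the helper `dual_write_requests` are hypothetical, and the `/<index>/_doc/<id>` path assumes a recent Elasticsearch version (older releases used a type name in that position).

```python
import json
from typing import Any, Dict, List
from urllib.request import Request


def dual_write_requests(
    cluster_urls: List[str], index: str, doc_id: str, document: Dict[str, Any]
) -> List[Request]:
    """Build one PUT per cluster so the same document lands in each of them."""
    body = json.dumps(document).encode("utf-8")
    return [
        Request(
            url=f"{url.rstrip('/')}/{index}/_doc/{doc_id}",
            data=body,
            method="PUT",
            headers={"Content-Type": "application/json"},
        )
        for url in cluster_urls
    ]


# Hypothetical endpoints, one cluster per datacenter.
clusters = ["http://es.dc1.example:9200", "http://es.dc2.example:9200"]
requests_to_send = dual_write_requests(clusters, "logs-2013-10", "1", {"msg": "hello"})
```

A real client would also need to decide what to do when one cluster accepts a write and the other does not, for example by retrying or queueing the failed write.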

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

In reply to Milad Fatenejad icksa1@gmail.com (24 Oct 2013, 23:07)



(Milad Fatenejad) #5

Thank you David, this has been very helpful.

Milad

In reply to David Pilato david@pilato.fr (Fri, Oct 25, 2013, 2:59 AM)


