Using ES in a dynamic EC2 environment

I am preparing to roll out a web application for QA testing on EC2. This web
application contains an embedded ES server. Scalr is used to spin up new
instances when load increases, and to tear down instances when load decreases.
I have a couple of questions regarding ES and its operation in this
environment.

  1. How many shards and replicas should I choose for this environment? I can
    restrict the minimum and maximum number of EC2 instances, but I would
    prefer to leave the maximum open-ended. I plan to set the minimum number of
    instances to two or three, mostly to ensure that most of our data is cached
    in memory by Hazelcast, which is acting as an IMDB in front of Elasticsearch.
  2. Is the shard count fixed? Can it be set dynamically based on the number of
    instances I have running? If not, is it detrimental to specify a very high
    number of shards (like 30 or more, to cover cases where 10-15 instances of
    the web app are running), and what happens if I have that number of shards
    and only two instances?
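For concreteness, an embedded node of the kind described above might be started
roughly like this inside the web application. This is a minimal sketch only:
the cluster name and settings are illustrative, and the class names are from
the 0.x Java client, so they may differ in the version actually in use.

import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.node.Node;
import org.elasticsearch.node.NodeBuilder;

public class EmbeddedSearch {
    public static void main(String[] args) {
        // Start a data-holding ES node inside the web application's JVM.
        // The index-level settings become defaults for indices created later.
        Node node = NodeBuilder.nodeBuilder()
                .settings(ImmutableSettings.settingsBuilder()
                        .put("cluster.name", "qa-cluster")   // illustrative name
                        .put("index.number_of_shards", 5)    // the defaults, made explicit
                        .put("index.number_of_replicas", 1))
                .node();

        Client client = node.client();
        // ... index and search through `client` while the instance is up ...
        client.close();
        node.close();
    }
}

In practice the node would be started and stopped from the web app's lifecycle
(for example, a ServletContextListener) rather than a main method.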

My apologies for bumping this, but it is important to me.

James,
I'll give it a go to get things started.

  • How many shards and replicas should I choose for this environment?
    There is no way to answer this generically; it depends on how many documents
    you need to index, how fast you need to index them, how often they change,
    how many users will execute queries, etc. As a general rule of thumb, more
    shards increase indexing (write) performance, and more replicas increase
    search/read performance. A higher number of machines also increases
    performance. ES adjusts the distribution of shards and replicas dynamically
    as you add more machines.

  • Shard count is currently fixed (it cannot be changed once an index is
    created). Having a very high number of shards introduces overhead, so it
    may not be a good idea. Another option is to create a new index with a
    higher number of shards when you need to and reindex the data into it; you
    can use index aliases to switch to the new index once reindexing is
    complete (see the sketch after this list).
    You can also consider using larger EC2 instances to scale up rather than
    adding instances. For example, you could set 5 shards and 1 replica and use
    5 small EC2 instances as the low point, then switch to bigger instances as
    needed by adding bigger ones and removing the smaller ones gradually.
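Sketching that reindex-and-switch idea (hypothetical index and alias names; the
admin-client methods shown come from later releases of the Java API and may be
named differently in the version you run):

import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.ImmutableSettings;

public class ReindexSwitch {
    public static void switchToBiggerIndex(Client client) {
        // 1. Create a new index with a higher shard count.
        client.admin().indices().prepareCreate("products_v2")
                .setSettings(ImmutableSettings.settingsBuilder()
                        .put("index.number_of_shards", 10)
                        .put("index.number_of_replicas", 1))
                .execute().actionGet();

        // 2. Reindex: read every document from "products_v1" and index it
        //    into "products_v2" (application-specific, omitted here).

        // 3. Repoint the alias the application actually queries; both alias
        //    changes go in one request, so the switch is effectively atomic.
        client.admin().indices().prepareAliases()
                .removeAlias("products_v1", "products")
                .addAlias("products_v2", "products")
                .execute().actionGet();
    }
}

Since the application only ever searches the "products" alias, the switch
requires no code change on the query side.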

Mostly, I think you'll need to profile the app first before setting up such an
automated system to scale up and down. My guess is that, the way ES works, it
would be computationally expensive to change the number of shards dynamically
(essentially it would likely require re-indexing all of the data), so you may
need to think of other ways of scaling up and down automatically.

Hope this helps.

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype

Thanks for that information.

I am using an application architecture in which the application does all
Create, Update, and Delete operations via an IMDB, and uses Elasticsearch for
queries on anything other than the primary key; Elasticsearch also serves as
the persistence mechanism for the IMDB.

+--------------------------------------------------------------+
|                  Tomcat Application Server                   |
|  +--------------------------------------------------------+  |
|  |                        Web App                         |  |
|  +--------------------------------------------------------+  |
|     |  (Create, PK Lookup,             |  (Queries for       |
|     V   and Delete)                    |   objects using     |
|  +------------------------------+      |   parameters        |
|  | Hazelcast IMDB (Autocluster) |      |   other than        |
|  +------------------------------+      |   primary key)      |
|     |  (write behind)                  |                     |
|     V                                  V                     |
|  +--------------------------------------------------------+  |
|  |              Elasticsearch (Autocluster)               |  |
|  +--------------------------------------------------------+  |
+--------------------------------------------------------------+

This is what each of my EC2 instances looks like; they are load balanced using
AWS. Scalr monitors the load on each server to determine when instances will be
created or destroyed.
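For illustration only, the write-behind hookup in the diagram above might look
roughly like the following MapStore: a made-up class wiring Hazelcast
persistence calls to the embedded ES client. The index/type names are
hypothetical, and both the Hazelcast and ES method signatures have varied
between versions; write-behind itself is enabled by giving the map's map-store
config a non-zero write-delay-seconds.

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import com.hazelcast.core.MapStore;
import org.elasticsearch.client.Client;

public class EsMapStore implements MapStore<String, String> {
    private final Client client; // client of the embedded ES node

    public EsMapStore(Client client) {
        this.client = client;
    }

    public void store(String key, String json) {
        // Called by Hazelcast asynchronously (write-behind) after the delay.
        client.prepareIndex("objects", "object", key)
                .setSource(json)
                .execute().actionGet();
    }

    public void storeAll(Map<String, String> entries) {
        for (Map.Entry<String, String> e : entries.entrySet()) {
            store(e.getKey(), e.getValue());
        }
    }

    public void delete(String key) {
        client.prepareDelete("objects", "object", key).execute().actionGet();
    }

    public void deleteAll(Collection<String> keys) {
        for (String key : keys) {
            delete(key);
        }
    }

    public String load(String key) {
        // PK lookups normally hit Hazelcast; this is the fallback path.
        return client.prepareGet("objects", "object", key)
                .execute().actionGet().getSourceAsString();
    }

    public Map<String, String> loadAll(Collection<String> keys) {
        Map<String, String> result = new HashMap<String, String>();
        for (String key : keys) {
            String json = load(key);
            if (json != null) {
                result.put(key, json);
            }
        }
        return result;
    }

    public Set<String> loadAllKeys() {
        return null; // returning null skips eager pre-loading of the map
    }
}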

Scalability testing is scheduled for three weeks from now, so it will be
interesting to see this functioning in the wild.

I'd be very interested in others' "real world" experiences using ES on EC2.

-- jim

Hi,

Allow me to expand a bit on the answer above:

The number of shards per index in elasticsearch is fixed. Each shard has 0 or
more replicas. When an index is created, all shards are allocated to the nodes
(so you might end up with shard 1 and shard 2 allocated to the same node). A
shard and its replica are never allocated to the same node.

Your maximum scale-out size (per index) is determined by number_of_shards *
(number_of_replicas + 1). This means that with the default settings of 5
shards, each with one replica, you will reach the scaling limit of that
specific index with 10 machines.
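Stated as arithmetic (a trivial sketch, e.g. a sanity check an autoscaling
policy could apply before adding another data node for this index):

public class IndexScalingLimit {
    public static void main(String[] args) {
        int numberOfShards = 5;    // index.number_of_shards
        int numberOfReplicas = 1;  // index.number_of_replicas
        // Total shard copies = primaries * (replicas + 1); beyond this many
        // data nodes, extra nodes hold no shards of this particular index.
        int maxUsefulDataNodes = numberOfShards * (numberOfReplicas + 1);
        System.out.println("Max useful data nodes: " + maxUsefulDataNodes); // 10
    }
}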

Regarding sizing, that's a good question. Let's start with the number of
shards. The more shards you have, the better indexing TPS you will get
(assuming you have enough machines for them to spread properly). More shards
also mean more shards to "hit" when you do a search (since search is
distributed across shards). This can be a good thing, since if you have heavy
search operations you distribute the load, though the fact that more shards
need to process a search request might result in higher latency (note that
async IO is used for this, so there are no blocking threads or anything like
that).

More replicas increase your high-availability factor, and since shard replicas
participate in search requests, they should spread the load of search
operations. On the other hand, when indexing, it means that indexing a document
happens on a shard and all its replicas. Note, though, that replication is done
in parallel, so the difference between 1 replica and 2 replicas is negligible
(assuming you have enough machines to spread the shards).

The number of shards is fixed. The ability to change the number of shards will
probably not happen in the near future; it is problematic, since with how
things work (and with how search engines in general work), it would mean
completely reindexing a substantial part of your data. Changing the number of
replicas at runtime is something that can certainly be implemented and is
planned.

To alleviate the problem of the number of shards being fixed, one of the
benefits of elasticsearch is that the number of indices is not. And, since you
can search over more than one index, you can devise a system where indices are
added dynamically. For example, for a log aggregation application, you can
decide to create an index per month. This allows you to scale out indefinitely.
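As a rough sketch of that index-per-month pattern (hypothetical index names;
the QueryBuilders package and method names differ between Java API versions), a
single search can simply name every monthly index it should cover:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;

public class MonthlyIndexSearch {
    public static SearchResponse searchRecentMonths(Client client, String term) {
        // New indices ("logs-2010-08", "logs-2010-09", ...) get created as
        // months roll over; queries just list however many exist so far.
        return client.prepareSearch("logs-2010-06", "logs-2010-07")
                .setQuery(QueryBuilders.termQuery("message", term))
                .execute().actionGet();
    }
}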

-shay.banon

Thanks, that is a nice bit of information, and it gives me nearly all the
information I need for our stress tests.

One question I have is: what happens when I add that 11th node to a 5 shard/1
replica configuration? Does it participate as a client/no-data node?

-- jim

If you allocate requests to it, then yes, it can act as a "router".
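A minimal sketch of starting that 11th node as such a "router", i.e. a
client-only node that joins the cluster but holds no shards (API names are from
the 0.x Java client and may differ by version):

import org.elasticsearch.client.Client;
import org.elasticsearch.node.Node;
import org.elasticsearch.node.NodeBuilder;

public class ClientOnlyNode {
    public static Client start() {
        // client(true) marks the node as holding no data; it still joins the
        // cluster and forwards index/search requests to the data nodes.
        Node node = NodeBuilder.nodeBuilder()
                .client(true)
                .node();
        return node.client();
    }
}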
