Upper bounds on the number of indexes in an Elasticsearch cluster

Hey guys. We're building a multi-tenant application, where users create
applications within our single server. For our current ES scheme, we're
building an index per application. Are there any stress tests or
documentation on the upper bounds of the number of indexes a cluster can
handle? From my current understanding of metadata and routing, every node
caches the metadata of all the indexes and shards for routing. At some
point, this will obviously overwhelm the node. Is my current understanding
correct, or is this information partitioned across the cluster as well?

Thanks,

Todd

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/200e1ce7-c56f-49d4-9c02-4b1dcc570bf2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Why do you want to create a huge number of indexes on just a single node?

There are smarter methods to scale. Use over-allocation of shards. This is
explained by kimchy in this thread:

http://elasticsearch-users.115913.n3.nabble.com/Over-allocation-of-shards-td3673978.html

TL;DR: you can create many thousands of aliases on a single index (or a few
indices) with just a few shards. There is no limit defined by ES; when your
configuration / hardware capacity is exceeded, you will see the node
getting sluggish.
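
A minimal sketch of the alias approach described above, assuming one shared
index and one filtered, routed alias per application. The index name
"tenants-v1", the field "app_id", and the "app-" alias prefix are hypothetical
names chosen for illustration; the code only builds the JSON body that would be
POSTed to the _aliases endpoint and does not contact a cluster.

```python
import json


def tenant_alias_action(index, app_id, field="app_id"):
    """Build one 'add' action: an alias that filters and routes by application id."""
    return {
        "add": {
            "index": index,                       # shared physical index (hypothetical name)
            "alias": "app-" + app_id,             # one alias per application
            "filter": {"term": {field: app_id}},  # searches only see this application's docs
            "routing": app_id,                    # docs for one application land on one shard
        }
    }


def alias_request_body(index, app_ids):
    """Build the full request body for POST /_aliases."""
    return {"actions": [tenant_alias_action(index, a) for a in app_ids]}


body = alias_request_body("tenants-v1", ["1001", "1002"])
print(json.dumps(body, indent=2))
```

Each application then reads and writes through its own alias, while the number
of physical indexes and shards stays fixed. Note that every alias is still an
entry in the cluster state, so this reduces the per-index overhead rather than
removing metadata growth entirely.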

Jörg

Hi Jorg,
We're storing each application in its own index so we can manage it
independently of others. There's no set load or usage on our
applications. Some will be very small, a few hundred documents. Others
will be quite large, in the billions. We have no way of knowing what the
usage profile is. Rather, we're initially thinking that expansion will
occur using a combination of additional indexes and aliases referencing
those indexes. This allows us to automate the management of the aliases
and indexes, and in turn allows us to scale them to the needs of the
application without over-allocating unused capacity. For instance, for
write-heavy applications we can allocate more shards (via alias); for
read-heavy applications we can allocate more replicas.

We're not running our cluster on a single node. Our cluster is small to
begin with: 6 nodes in our current POC. Ultimately I expect us to
grow each cluster we stand up to 20 or so nodes. We'll expand as necessary
to support the number of shards and replicas and keep our performance up.
I'm not particularly worried about our ability to scale horizontally with
our hardware.

Rather, I'm concerned with how far we can scale the number of indexes,
and how that relates to the number of machines. When we keep adding
hardware, does this increase the upper bound on the number of indexes we
can have? Not the physical shards and replicas, but the routing
information for the master of the shards and location of the replicas.
I've done distributed data storage for many years, and none of the
documentation on ES makes it clear if this becomes an issue operationally.
I'm leery of assuming it will "just work". When implementing something
like this, you either have to do a distributed tree for your metadata to
get the partitioning you need to scale infinitely, or every node must store
every shard's master information. How does it work in ES?

Thanks,
Todd

If you consider tens of thousands of indices on tens of thousands of nodes,
and the master node is the only node that can write to the cluster state,
it will have a lot of work to do to keep up with all cluster state updates.

When the rate of changes to the cluster state increases, the master node
will be challenged to propagate the state changes to all other nodes
reliably and fast enough. There is no "distributed tree" yet; for example,
there are no special "forwarder nodes" that communicate the cluster state
to a partitioned set of nodes.

See https://github.com/elasticsearch/elasticsearch/issues/6186

The cluster state is a big compressed JSON structure which also must fit
into the heap memory of master-eligible nodes. ES also uses privileged
network traffic channels for cluster state communication so that it takes
precedence over ordinary index/search messaging. But all these precautions
may not be enough at some point. You can observe that point by retrieving a
growing cluster state over the cluster state API and measuring the size and
time of this request.
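
The measurement suggested above could be sketched roughly as follows. This
assumes a node reachable over HTTP (the address http://localhost:9200 in the
commented usage is an assumption) and uses only the GET /_cluster/state
endpoint; the function names are hypothetical. The size reported is that of the
uncompressed JSON HTTP response, which is only a proxy for what the master
actually publishes between nodes.

```python
import time
import urllib.request


def fetch_cluster_state(base_url):
    """GET /_cluster/state from one node and return the raw response bytes."""
    url = base_url.rstrip("/") + "/_cluster/state"
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def measure_cluster_state(base_url, fetch=fetch_cluster_state):
    """Return (size_in_bytes, elapsed_seconds) for one cluster state request.

    `fetch` is injectable so the timing logic can be exercised without a cluster.
    """
    start = time.perf_counter()
    payload = fetch(base_url)
    elapsed = time.perf_counter() - start
    return len(payload), elapsed


# Example usage against a live node (address is an assumption):
# size, seconds = measure_cluster_state("http://localhost:9200")
# print("cluster state: %d bytes in %.3f s" % (size, seconds))
```

Running this periodically while creating empty indexes gives a rough curve of
cluster state size and retrieval time against index count.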

On the other hand, you can have a calm cluster state and many thousands of
indices when the type mappings are constant, no field updates occur, and no
nodes connect/disconnect. It all depends on how you have to operate the
cluster. One important thing is to allocate enough resources to
master-eligible data-less nodes so they are not hindered by extra
search/index load.

N.B. 20 nodes is not a big cluster. There are ES clusters of hundreds and
thousands of nodes. From my understanding, the communication between the
master and 20 nodes is not a serious problem. This becomes an issue at
~500-1000 nodes.

Jörg

It sounds like we're going to need to test our upper bounds of indexes
(with no data) to see how many we can support. We may need to re-evaluate
our thoughts on an index per app. We might be better off doing a
statically sized set of indexes, then consistently hashing our applications
to those indexes. Thank you for your help!
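
A minimal sketch of the "fixed set of indexes plus consistent hashing" idea,
assuming application ids are strings. The class name IndexRing, the index names
"apps-00" to "apps-07", and the choice of 64 virtual points per index are all
hypothetical; this illustrates the general technique and is not something ES
provides.

```python
import bisect
import hashlib


def stable_hash(key):
    """64-bit hash that is stable across processes (built-in hash() of str is not)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class IndexRing:
    """Consistent-hash ring mapping application ids onto a fixed set of index names."""

    def __init__(self, index_names, vnodes=64):
        # Place several virtual points per index on the ring to even out the spread.
        points = sorted(
            (stable_hash("%s#%d" % (name, i)), name)
            for name in index_names
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in points]
        self._names = [n for _, n in points]

    def index_for(self, app_id):
        """Return the index owning the first ring point after the app's hash."""
        pos = bisect.bisect_right(self._hashes, stable_hash(app_id))
        return self._names[pos % len(self._names)]  # wrap around at the end of the ring


ring = IndexRing(["apps-%02d" % i for i in range(8)])
print(ring.index_for("app-42"))
```

If an index is later added to the ring, only the applications whose hash now
falls before one of the new index's points change owner. Their documents would
still have to be reindexed into the new index.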

Hi Todd,

Maybe I missed it, but I don't think you said how many
applications/indexes you have in mind: dozens, hundreds, a few thousand,
10K+?

Some of our SPM (http://sematext.com/spm/) users have Elasticsearch
clusters with a dozen or so nodes and 5-10K indexes. Our own Logsene
(http://sematext.com/logsene/) has a per-user logical index that consists of
lots of time-based indexes, and the cluster seems to be doing fine.

You mentioned testing with empty indexes - that's fine, but don't skip
testing with realistic indexes and queries that populate various caches and
eat your heap.

Otis

Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/
