Performance of many index aliases

mcfadden · May 23, 2019, 10:40pm

We're running a SaaS platform that uses Elasticsearch in a growing capacity. We're looking into improving our ability to scale, and at several approaches to mitigating the noisy-neighbor problem in shards.

We really like the approach outlined here of per-tenant index aliases with filtering and routing (and the ability this provides for us to split large users off to their own index): https://www.elastic.co/blog/found-multi-tenancy

However, I'm a bit concerned about having 150,000 index aliases, and I'd like to allow for future growth up to 250,000 tenants. This post discourages having "hundreds of thousands" of aliases, so I'm not sure this is the best approach now: "Is there limit for alias indexes?"

Is this many index aliases okay or are we getting into multiple-clusters territory?

xeraa · May 25, 2019, 10:19pm

I think Zach's answer is still what we recommend:

Hundreds or thousands of aliases are fine. Hundreds of thousands, millions, etc are not ok

BTW, aliases are only one of quite a few architecture considerations. Not sure how you intend to slice the clusters, but Cross Cluster Search might be something you could use as well. I'd generally say that humongous clusters are a bit of a problem because of the blast radius; smaller clusters might be easier to manage.

mcfadden · May 28, 2019, 4:51pm

Thanks @xeraa!

At the moment we have all tenants in a single index* and are doing the filtering manually when we build the queries. I'm considering the aliases as a way to:
a) Enforce our per-tenant filtering
b) Switch to using the tenant id as the routing value, while still letting us split some of our biggest consumers off onto their own index so that their data can span multiple shards.
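
To make (a) and (b) concrete, here is a rough sketch of the request body that a per-tenant filtered alias with routing might use with the `_aliases` API. The index name `people`, the field name `tenant_id`, the alias naming scheme, and both helper functions are hypothetical and only for illustration; the sketch builds the JSON body and does not send anything to a cluster.

```python
import json


def tenant_alias_action(index, tenant_id, field="tenant_id"):
    """Build one 'add' action: an alias that filters and routes by tenant.

    The alias name pattern and the field name are assumptions, not
    anything prescribed by Elasticsearch.
    """
    return {
        "add": {
            "index": index,
            "alias": f"{index}-tenant-{tenant_id}",
            # Only documents belonging to this tenant are visible via the alias.
            "filter": {"term": {field: tenant_id}},
            # Reads and writes through the alias use the tenant id as routing.
            "routing": str(tenant_id),
        }
    }


def alias_request_body(index, tenant_ids):
    """Combine several tenant actions into one body for POST /_aliases."""
    return {"actions": [tenant_alias_action(index, t) for t in tenant_ids]}


if __name__ == "__main__":
    print(json.dumps(alias_request_body("people", [42, 43]), indent=2))
```

Moving a large tenant to its own index would then presumably mean re-pointing that tenant's alias at the new index (and dropping the routing value), without changing the queries that go through the alias.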

Our cluster is performing within our expectations at the moment, but we are well aware of the single point of failure we have by being on a single cluster like this.

What other architecture approaches should we be considering?

*We have many indices, as we're indexing many unrelated things for our tenants, e.g. People, Events, Media, etc. Each object type has its own index, but all tenants share the single index for that document type.

warkolm · May 28, 2019, 8:58pm

This is a great idea.

mcfadden · May 31, 2019, 3:08pm

What kinds of things can I test for when experimenting to find a feasible number of aliases per cluster? Is it only heap memory, or are there other overheads with aliases as well?

xeraa · May 31, 2019, 10:23pm

Aliases will be part of the cluster state. 7.0 is changing the game there, but I would generally make sure that your cluster stays responsive to dropped nodes, index creations, master elections, and so on.

In general, I think you will need to figure out whether that works for you and your scenario. We consider that kind of setup an outlier and, IMO, don't actively test it.
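
One possible way to start experimenting is to watch how the size of the cluster state metadata, and the time it takes to fetch, change as aliases are added. The sketch below is only a starting point under stated assumptions: the `http://localhost:9200` address is a placeholder, the helper names are hypothetical, and it assumes the `GET /_cluster/state/metadata` response lists each index's aliases under `metadata.indices.<index>.aliases`. It does not cover the responsiveness checks mentioned above (dropped nodes, index creations, master elections), which would need to be exercised separately.

```python
import json
import time
import urllib.request


def fetch_metadata(base_url="http://localhost:9200"):
    """Fetch the metadata portion of the cluster state as raw bytes.

    The base URL is a placeholder for a test cluster.
    """
    with urllib.request.urlopen(f"{base_url}/_cluster/state/metadata") as resp:
        return resp.read()


def probe_cluster_state(fetch=fetch_metadata):
    """Time one metadata fetch and report its size and alias count."""
    start = time.perf_counter()
    raw = fetch()
    elapsed = time.perf_counter() - start

    # Assumed layout: metadata -> indices -> <index name> -> aliases
    indices = json.loads(raw).get("metadata", {}).get("indices", {})
    alias_count = sum(len(meta.get("aliases", [])) for meta in indices.values())

    return {"seconds": elapsed, "bytes": len(raw), "aliases": alias_count}
```

Running `probe_cluster_state()` before and after creating batches of aliases on a test cluster would give a rough growth curve to compare against the alias counts being considered.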

system · June 28, 2019, 10:23pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
