[ES 2.3] Modeling a heterogeneous multi-tenant document store

Hi folks,

I'm using ES 2.3 and trying to come up with a data model that meets the following requirements:

  • multi-tenant document storage for ~1000 tenants
  • heterogeneous document schemas: each tenant may have up to 20 different document types, and each type is unique, so we could theoretically end up with 20K different document types
  • document schemas may be updated, with the expectation that documents are reindexed without downtime for reads; it is OK to block writes

I would appreciate any suggestions, considerations and recommendations on how to best model this.

Thank you,
Ionut

Hi @margelatu,

For starters, you can read a multi-tenancy related blog post by our Cloud engineer Konrad.

I think putting all tenants into one index is out of the question, so you are left with:

  • Index per customer
  • Cluster per customer

With the index per customer solution you will never have full isolation between tenants. Should your tenants have direct access to their index, it's very hard to guarantee isolation and I'd recommend you use a cluster per tenant. If your customers only interact with Elasticsearch via your application (which does the access control), then index per customer might be an option.
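
For the index per customer option, the requirement to reindex without read downtime is often handled by pointing reads at an alias and swapping it to a new versioned index once the reindex has finished. The sketch below only builds the request bodies for `POST /_reindex` and `POST /_aliases`; it does not talk to a cluster. The `tenant-<id>-v<n>` naming scheme and the helper names are assumptions made for illustration, not something from the original question.

```python
def versioned_index(tenant_id, version):
    # Hypothetical naming scheme; index names must be lowercase.
    return "tenant-{}-v{}".format(tenant_id.lower(), version)


def read_alias(tenant_id):
    # Hypothetical alias the application reads through.
    return "tenant-{}".format(tenant_id.lower())


def reindex_body(tenant_id, old_version, new_version):
    # Body for POST /_reindex: copy documents from the old index
    # into the new one (created beforehand with the updated mappings).
    return {
        "source": {"index": versioned_index(tenant_id, old_version)},
        "dest": {"index": versioned_index(tenant_id, new_version)},
    }


def alias_swap_body(tenant_id, old_version, new_version):
    # Body for POST /_aliases: both actions are applied atomically,
    # so readers always see either the old or the new index.
    alias = read_alias(tenant_id)
    return {
        "actions": [
            {"remove": {"index": versioned_index(tenant_id, old_version),
                        "alias": alias}},
            {"add": {"index": versioned_index(tenant_id, new_version),
                     "alias": alias}},
        ]
    }
```

The application would block writes for the tenant, create the new index, run the reindex, send the alias swap, and then resume writes against the new index. Reads go through the alias the whole time.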

One index per customer can also be problematic from a performance perspective as you cannot enforce any resource limits per tenant. So if one of them is running e.g. heavy aggregations it will have an impact on others.

If you want to go down the cluster per customer route, you should look into Docker, which allows you to constrain resource usage per container. If you are looking for a turn-key solution for managing this number of clusters, you might also be interested in Elastic Cloud Enterprise (disclaimer: I work for Elastic).
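
As a rough illustration of constraining resources with Docker, the sketch below builds a `docker run` command line for a single-node tenant cluster. The container name pattern, the image tag `elasticsearch:2.3`, the helper name, and the choice to give the JVM heap half of the container memory are all assumptions; `ES_HEAP_SIZE` is the heap setting read by the 2.x startup scripts.

```python
def docker_run_args(tenant_id, memory_mb, cpu_shares,
                    image="elasticsearch:2.3"):
    # Assumption: give the JVM heap half of the container memory and
    # leave the rest for the filesystem cache.
    heap_mb = memory_mb // 2
    return [
        "docker", "run", "-d",
        "--name", "es-{}".format(tenant_id),       # hypothetical naming
        "--memory", "{}m".format(memory_mb),       # hard memory limit
        "--cpu-shares", str(cpu_shares),           # relative CPU weight
        "-e", "ES_HEAP_SIZE={}m".format(heap_mb),  # ES 2.x heap setting
        image,
    ]
```

For example, `docker_run_args("acme", 4096, 512)` yields a command that caps the container at 4096 MB of memory with a 2048 MB heap. The list can be passed to `subprocess.run` as-is.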

Daniel
