Multi-tenant design

abhishekc92 · March 24, 2021, 11:33pm

Hi, I'm currently in the process of designing a multi-tenant indexing strategy for my system. I'm generally aware of the approaches commonly used in such cases -

1. Index per tenant
2. Shared index across tenants
3. Some variation of the above with custom routing based on a tenant id

My documents transition between many 'states'. In their initial state they are unassigned (they have no tenant id) until they eventually get assigned to one of the tenants in the system, at which point the assigned tenant id is immutable. I think this generally rules out using a tenant id as a routing key, as documents created initially are not associated with a tenant. The only straightforward option I can think of functionally is option 2: lumping all documents together in one index and filtering by tenant id, so that a tenant can retrieve only the docs assigned to it. The downside is that, due to the nature of my data, I am unable to optimize per tenant in any way, and have to search across all my documents and then filter. Is there any other way that an Elastic guru can recommend? Thanks
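
For illustration, the shared-index approach with a tenant filter could be sketched roughly as below. This only builds the search request body as a plain Python dict. The field name `tenant_id` and the helper `tenant_scoped_query` are assumptions made for the example, not something from an existing setup.

```python
from typing import Any, Dict, Optional

# Assumed field name; it would need to be mapped as a keyword field.
TENANT_FIELD = "tenant_id"


def tenant_scoped_query(user_query: Dict[str, Any],
                        tenant_id: Optional[str]) -> Dict[str, Any]:
    """Wrap a query so it matches only one tenant's documents.

    With tenant_id=None the query matches only unassigned documents,
    i.e. those that have no value in the tenant field yet.
    """
    bool_query: Dict[str, Any] = {"must": [user_query]}
    if tenant_id is None:
        # Unassigned documents: the tenant field does not exist.
        bool_query["must_not"] = [{"exists": {"field": TENANT_FIELD}}]
    else:
        # Filter context: not scored, and eligible for caching.
        bool_query["filter"] = [{"term": {TENANT_FIELD: tenant_id}}]
    return {"query": {"bool": bool_query}}


# Hypothetical example values.
body = tenant_scoped_query({"match": {"title": "invoice"}}, "tenant-42")
unassigned = tenant_scoped_query({"match_all": {}}, None)
```

Because the tenant restriction sits in a `filter` clause, it does not affect scoring and Elasticsearch can cache it, which may soften the "search everything, then filter" concern somewhat.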

AClerk · March 25, 2021, 1:45am

When you refer to a 'tenant', is it a physical host you want to route the data to?
Or is it all logically separated on the same hardware?

I would go with the index per tenant approach.
e.g.
Tenant1 will have indices like tenant1-my-data-source-date kind of naming convention.
Tenant2 will have indices like tenant2-my-data-source-date.
etc. etc.
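
A minimal sketch of that naming convention, assuming a `yyyy.MM.dd` date suffix (the date format and both helper names are made up for the example):

```python
from datetime import date


def tenant_index_name(tenant: str, data_source: str, day: date) -> str:
    """Build a name like tenant1-my-data-source-2021.03.25.

    Elasticsearch index names must be lowercase, hence the lower().
    """
    return f"{tenant}-{data_source}-{day:%Y.%m.%d}".lower()


def tenant_index_pattern(tenant: str) -> str:
    """Wildcard pattern covering all indices of one tenant.

    The hyphen before '*' keeps 'tenant1-*' from also matching
    indices that belong to 'tenant10'.
    """
    return f"{tenant}-*".lower()


# Hypothetical example values.
name = tenant_index_name("Tenant1", "my-data-source", date(2021, 3, 25))
pattern = tenant_index_pattern("Tenant1")
```

A search for one tenant can then target the pattern instead of a single index.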

abhishekc92 · March 25, 2021, 3:02am

Thanks for your reply. I am looking to logically separate the data. However, as I previously mentioned, when the document is first indexed, it does not belong to any tenant, but rather sits unassigned in our system. Subsequently, the document is 'assigned' to one of our tenants. So what I am grappling with is the best way to move this document into the appropriate tenant's index (or assign it a routing key) when the document gets updated with tenant information, which I do not have initially.

Christian_Dahlqvist · March 25, 2021, 3:10am

How many tenants do you need to support? What is the expected total data volume? How many concurrent queries do you need to support? How much data are you indexing per day?

AClerk · March 25, 2021, 4:11am

Not sure how you are doing the assigning part.
But at this stage, you might re-index the document into the relevant new index.
So initially you will have to index the document into a generic index --> tenant0-my-data-source-date
Then re-index the document into the relevant index --> tenantN-my-data-source-date.
This is one of many possible solutions.
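
As a rough sketch of that move, the two requests could look like the following. It only builds the request payloads: the first dict is a body for `POST _reindex`, the second describes a `DELETE /<index>/_doc/<id>` call. The helper name and the example index names and id are hypothetical.

```python
from typing import Any, Dict, Tuple


def move_document_requests(doc_id: str, source_index: str,
                           dest_index: str) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Build the payloads needed to move one document between indices.

    _reindex only copies, so the original document has to be
    deleted from the generic index in a separate step.
    """
    reindex_body = {
        "source": {
            "index": source_index,
            # Restrict the copy to the single document being assigned.
            "query": {"ids": {"values": [doc_id]}},
        },
        "dest": {"index": dest_index},
    }
    delete_request = {"index": source_index, "id": doc_id}
    return reindex_body, delete_request


# Hypothetical example values.
reindex_body, delete_request = move_document_requests(
    "doc-123",
    "tenant0-my-data-source-2021.03.25",
    "tenant7-my-data-source-2021.03.25",
)
```

Since the copy and the delete are separate requests, the document can briefly exist in both indices (or in only one, if the second request fails), so the caller would need to handle retries.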

abhishekc92 · March 25, 2021, 4:44am

This 'assignment' happens at some point in time outside this system, and the document in ES will be updated with a tenant id at that point. Yes, indexing into a generic index and re-indexing is an option I'm considering, but on the face of it, it sounds sub-optimal to me. Being new to ES, I was wondering if there is a better option. At least for now, it seems like a toss-up between your suggestion and concluding that ES may not fit our use case very well and evaluating other solutions. Appreciate the inputs btw!

AClerk · March 25, 2021, 4:49am

Yes, it seems like ELK is not optimal here, though it is possible to find a solution.
So when the document is first created without a tenant assigned, what is the purpose of it being in Elasticsearch? Do you still search it? Use it for visualisations?
If not, skip this stage and just send the doc when it moves to a more relevant status.

Just thoughts, without really understanding your full needs.

abhishekc92 · March 25, 2021, 5:07am

Sure, all great questions. Unfortunately, we would still like these documents to be searchable in their initial state. Agreed, skipping would be an option if we did not want to search this data. The proposed solution is probably workable starting off, but I foresee issues if and when the data volume or the number of tenants grows over time.

