Multi-tenant Elasticsearch deployment

I need some advice on deploying Elasticsearch for a multi-tenant application.
I have three options:

  1. One index for each tenant
  2. One big index for all the tenants with default routing
  3. One big index for all the tenants with custom routing

The expected data volume is around 15GB - 20GB.

We have more than 2000 tenants, and this number could increase further. Option 1 would cause memory issues because of the large number of shards it creates.
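To put rough numbers on option 1, here is a back-of-the-envelope sketch. It assumes the 15GB - 20GB figure is the total across all tenants and that each tenant index has one primary shard and one replica; neither assumption is stated above.

```python
# Rough shard arithmetic for option 1 (one index per tenant).
TENANTS = 2000
TOTAL_DATA_GB = 20          # upper end of the expected volume, assumed to be the total
PRIMARIES_PER_INDEX = 1     # assumed
REPLICAS_PER_PRIMARY = 1    # assumed


def shard_count(tenants, primaries, replicas):
    """Total shards (primaries plus replica copies) across all tenant indices."""
    return tenants * primaries * (1 + replicas)


def avg_primary_shard_mb(total_gb, tenants, primaries):
    """Average primary shard size in MB if the data were spread evenly."""
    return total_gb * 1024 / (tenants * primaries)


total_shards = shard_count(TENANTS, PRIMARIES_PER_INDEX, REPLICAS_PER_PRIMARY)
avg_mb = avg_primary_shard_mb(TOTAL_DATA_GB, TENANTS, PRIMARIES_PER_INDEX)
print(f"{total_shards} shards, about {avg_mb:.1f} MB per primary shard")
```

Under these assumptions the cluster would hold thousands of very small shards, each carrying its own fixed overhead, which is the concern with option 1.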

I need advice in choosing between options 2 and 3.

With option 2, we can filter on the tenant ID to ensure data separation, but query performance for tenants will suffer because all the shards need to be queried.
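A minimal sketch of what filtering on the tenant ID could look like as a search request body. The field names `tenant_id` and `content` are hypothetical, and the body is built as a plain dictionary so it can be passed to whichever client is in use.

```python
import json


def tenant_search_body(tenant_id, text):
    """Build a search body that matches `text` but only within one tenant.

    `tenant_id` and `content` are hypothetical field names.
    """
    return {
        "query": {
            "bool": {
                # Scored clause: the user's full-text search.
                "must": [{"match": {"content": text}}],
                # Filter clause: restricts hits to one tenant and does not
                # contribute to the relevance score.
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        }
    }


body = tenant_search_body("tenant-42", "quarterly report")
print(json.dumps(body, indent=2))
```

For the `term` filter to match exactly, the tenant ID field would normally be mapped as a `keyword` rather than analyzed text.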

Option 3 would ensure that we query only the specific shard, but I think we could run into "hotspots".
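The hotspot concern can be illustrated with a small simulation. This is a simplified sketch, not Elasticsearch's actual routing code: it uses CRC32 as a stand-in for the real hash function, and the shard count and tenant sizes are made up for illustration.

```python
import zlib
from collections import Counter

NUM_SHARDS = 5  # hypothetical primary shard count


def shard_for(routing_value, num_shards=NUM_SHARDS):
    """Pick a shard from a routing value.

    Simplified stand-in: Elasticsearch uses its own hash (Murmur3);
    CRC32 is used here only to keep the sketch dependency-free.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def docs_per_shard(tenant_doc_counts, num_shards=NUM_SHARDS):
    """Sum document counts per shard when every tenant routes by its own ID."""
    load = Counter({shard: 0 for shard in range(num_shards)})
    for tenant_id, doc_count in tenant_doc_counts.items():
        load[shard_for(tenant_id, num_shards)] += doc_count
    return load


# Hypothetical workload: 1999 small tenants and one very large one.
tenants = {f"tenant-{i}": 1_000 for i in range(1, 2000)}
tenants["tenant-big"] = 5_000_000

load = docs_per_shard(tenants)
for shard, docs in sorted(load.items()):
    print(f"shard {shard}: {docs} docs")
```

Because all of a tenant's documents land on the same shard, whichever shard receives the large tenant ends up far heavier than the rest. In Elasticsearch itself, custom routing is applied by passing the same `routing` value on both index and search requests.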

Another issue with options 2 and 3 is that there is no complete data separation for scoring: one tenant's data will affect the scoring of other tenants' results.

Can you please advise on the best strategy for our use case?

Given the size of the data, which isn't huge even if it doubled, I would just go with option 2.

Thanks @warkolm for the reply.

The only issue with having one big sharded index is that there is no complete data separation for scoring: one tenant's data will affect the scoring of other tenants' results. Is there a way to address this problem?

What if you use a filter and make sure you have a customer ID field of some kind?
