I need some advice on deploying Elasticsearch for a multi-tenant application.
I have 3 options:
- One index per tenant
- One big index for all the tenants with default routing
- One big index for all the tenants with custom routing
The expected data volume is around 15 GB to 20 GB.
We have more than 2,000 tenants, and this number could grow further. Option 1 would cause memory issues because of the large number of shards it requires.
I need advice on choosing between options 2 and 3.
With option 2, we can filter on the tenant ID to ensure data separation, but query performance for every tenant will suffer because all shards have to be queried.
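To make the filtering approach concrete, here is a rough sketch of the search body I have in mind. The field names `tenant_id` and `content` and the tenant value are placeholders, not our real mapping. It only builds the query DSL as a plain dict, so it does not need a running cluster.

```python
import json


def tenant_search_body(tenant_id, text):
    """Build a search body that scores on `text` but only within one tenant."""
    return {
        "query": {
            "bool": {
                # The must clause contributes to the relevance score.
                "must": [{"match": {"content": text}}],
                # The filter clause restricts results without affecting the score.
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        }
    }


# Hypothetical tenant ID and search text, for illustration only.
body = tenant_search_body("tenant-42", "invoice")
print(json.dumps(body, indent=2))
```

Without a routing value, this request is still sent to every shard of the index; the filter only limits which documents match.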
Option 3 would ensure that we query only the specific shard, but I think we could run into "hotspots", where the shards that hold large tenants grow much bigger and busier than the others.
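My worry about hotspots can be illustrated with a small simulation. With custom routing, the tenant ID would be passed as the `routing` parameter on index and search requests, so all of a tenant's documents land on one shard. The sketch below assumes 5 primary shards and made-up tenant sizes, and it uses CRC32 as a stand-in hash; Elasticsearch uses its own Murmur3-based routing hash, so the actual shard numbers would differ.

```python
import zlib
from collections import Counter

NUM_SHARDS = 5  # hypothetical primary shard count


def shard_for(routing_value, num_shards=NUM_SHARDS):
    """Map a routing value to a shard number.

    Stand-in hash only: Elasticsearch does not use CRC32 for routing.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def docs_per_shard(tenant_doc_counts, num_shards=NUM_SHARDS):
    """Sum document counts per shard when routing by tenant ID."""
    totals = Counter({shard: 0 for shard in range(num_shards)})
    for tenant_id, n_docs in tenant_doc_counts.items():
        totals[shard_for(tenant_id, num_shards)] += n_docs
    return totals


# Made-up sizes: 1,999 small tenants and one very large tenant.
tenants = {f"tenant-{i}": 1_000 for i in range(1, 2000)}
tenants["tenant-big"] = 5_000_000

totals = docs_per_shard(tenants)
for shard, n_docs in sorted(totals.items()):
    print(f"shard {shard}: {n_docs} docs")
```

Whichever shard receives the large tenant ends up holding most of the data, no matter how evenly the small tenants spread out.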
Another issue with options 2 and 3 is that there is no complete data separation for scoring: term statistics are shared across the index, so one tenant's data affects the relevance scores of other tenants.
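To show what I mean about scoring, here is a small calculation using the BM25 inverse document frequency formula that Lucene uses, idf = ln(1 + (N - n + 0.5) / (n + 0.5)), where N is the number of documents and n is the number that contain the term. The document counts are invented for illustration, and in practice Elasticsearch computes these statistics per shard by default, so real numbers would differ.

```python
import math


def bm25_idf(total_docs, docs_with_term):
    """BM25 inverse document frequency as used by Lucene."""
    return math.log(1 + (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))


# Hypothetical numbers: the term is rare inside one tenant's data
# but very common across all tenants in the shared index.
idf_tenant_only = bm25_idf(total_docs=1_000, docs_with_term=10)
idf_shared_index = bm25_idf(total_docs=2_000_000, docs_with_term=1_500_000)

print(f"IDF with tenant-only statistics:  {idf_tenant_only:.3f}")
print(f"IDF with shared-index statistics: {idf_shared_index:.3f}")
```

The same term gets a much lower weight under shared statistics, so a tenant's ranking depends on what other tenants have indexed even though the filter hides their documents.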
Can you please advise me on the best strategy that can be used for our use case?