We have a multitenant transactional SaaS application. I am busy building a prototype implementation for Elasticsearch. The idea is to index our transactional data and then use it for fast search and analytical aggregations. The current transactional collection has 10m+ documents (500 GB+). Each tenant has a uniqueId, and searches/aggregations will mostly be filtered by this id (there is some internal reporting that will be done across all tenants, but this is not the main focus of the solution).
Our system allows tenants to create custom properties. These properties are essential to each tenant and are used in reporting and searching. They should be isolated per tenant. The custom properties should also be part of the mapping, as I use the mapping to dynamically present the available filter fields at runtime.
- Option A
This is the current way I am doing it.
I have a single large index containing all transactions for all tenants. At runtime I add the uniqueId as a filter when searching. At index time, I flatten the custom properties and add the uniqueId as a sub-object, so the fields are indexed like “CustomPropertyName.UniqueCustomerId.CustomPropertyValue”. This way I can supply the uniqueId and ‘build’ the search field at runtime, and the field is in the mapping as required. I read in https://www.elastic.co/blog/index-vs-type that a single large index is more efficient than many smaller indexes, but as this article is quite old, I am not sure whether this is still relevant.
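To make Option A concrete, here is a minimal sketch of the flattening and query building in plain Python (no client library, just the request bodies). The field name `tenantId`, the helper names, and the sample values are hypothetical. I am also assuming the field path is `<propertyName>.<uniqueId>` with the property value stored at that path, which is only one way to read the naming scheme above.

```python
from typing import Any, Dict, List


def flatten_custom_properties(tenant_id: str, props: Dict[str, Any]) -> Dict[str, Any]:
    """Nest each custom property under the tenant id: {name: {tenant_id: value}}."""
    return {name: {tenant_id: value} for name, value in props.items()}


def build_tenant_query(tenant_id: str, prop_name: str, value: Any) -> Dict[str, Any]:
    """Build a search body that filters on the tenant and on one custom property."""
    field = f"{prop_name}.{tenant_id}"  # the search field is 'built' at runtime
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"tenantId": tenant_id}},  # assumed tenant field name
                    {"term": {field: value}},
                ]
            }
        }
    }


def tenant_filter_fields(mapping_properties: Dict[str, Any], tenant_id: str) -> List[str]:
    """List the custom properties in the mapping that have a sub-field for this tenant."""
    return sorted(
        name
        for name, spec in mapping_properties.items()
        if tenant_id in spec.get("properties", {})
    )
```

With this layout, isolation depends entirely on the tenant filter and the tenant-specific field path being applied on every request, since all tenants share one index and one mapping.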
- Option B
I create a separate index for each tenant. This way I don’t need to map the uniqueId as a sub-object, and the custom properties will still be isolated.
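For comparison, a minimal sketch of how Option B might look, again in plain Python. The `transactions-` prefix and the helper names are hypothetical. The only Elasticsearch rule I am relying on is that index names must be lowercase.

```python
from typing import Any, Dict


def tenant_index_name(tenant_id: str, prefix: str = "transactions") -> str:
    """Derive a per-tenant index name; index names must be lowercase."""
    return f"{prefix}-{tenant_id.lower()}"


def build_search_request(tenant_id: str, prop_name: str, value: Any) -> Dict[str, Any]:
    """Target the tenant's own index, so the field needs no tenant id in its path."""
    return {
        "index": tenant_index_name(tenant_id),
        "body": {
            "query": {
                "bool": {
                    "filter": [{"term": {prop_name: value}}]
                }
            }
        },
    }
```

Here the tenant filter disappears from the query body because the index itself scopes the data. The cross-tenant internal reporting could target a wildcard pattern such as `transactions-*` instead of a single index.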
My question is: is searching/aggregating on a smaller per-tenant index more efficient/quicker than on a single large index? And is there a better way to deal with the custom properties? I have read a few forum posts on this topic, but none seem to answer the question.
Very much appreciate your time.