Generic mapping/schema for a multi-tenant use case

elastic-common-schema

(Srinivasan Ramaswamy) #1

We have a multi-tenant use case where we need to index data coming from different tenants. There could be a shared common set of fields between users and each user can add their own fields. There can also be cases where the majority of the fields are different.

How can we design a generic mapping/schema that applies to any tenant? Maintaining one mapping file per tenant would be a big operational headache.
Our initial plan was to come up with a generic schema that looks like this:
string_field_1
string_field_2
.
.
string_field_25
int_field_1
int_field_2
.
.
int_field_25

We are planning to treat nested fields the same way, in a separate section of the schema. We would have many placeholder fields (as shown above) to make sure that all tenants are supported.
We can then map each tenant's fields to one of the placeholder fields in the generic schema.
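
To make the mapping step concrete, here is a rough sketch of how a tenant's fields could be assigned to the placeholder fields. The helper names (build_slot_map, to_generic_doc) and the sample tenant fields are hypothetical and only for illustration; the limit of 25 comes from the schema above.

```python
from collections import defaultdict

SLOTS_PER_TYPE = 25  # the schema above has 25 placeholders per type


def build_slot_map(tenant_fields):
    """Assign each tenant field (name -> "string" or "int") to the next free placeholder."""
    counters = defaultdict(int)
    slot_map = {}
    for name, field_type in tenant_fields.items():
        counters[field_type] += 1
        if counters[field_type] > SLOTS_PER_TYPE:
            raise ValueError(f"more than {SLOTS_PER_TYPE} {field_type} fields")
        slot_map[name] = f"{field_type}_field_{counters[field_type]}"
    return slot_map


def to_generic_doc(doc, slot_map):
    """Rename a tenant document's keys to their placeholder names before indexing."""
    return {slot_map[key]: value for key, value in doc.items() if key in slot_map}


# Hypothetical tenant with two string fields and one int field.
tenant_a = {"title": "string", "brand": "string", "stock": "int"}
slot_map_a = build_slot_map(tenant_a)
generic_doc = to_generic_doc({"title": "red shoe", "brand": "acme", "stock": 3}, slot_map_a)
```

Every query would need the same translation in reverse, which is part of why I expect the ranking queries to get complicated.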

I am a little worried about this approach because tf-idf scores are computed per field, and each placeholder field would mix unrelated fields from different tenants. I am also worried that it would make our ranking queries fragile and far more complicated.

Is that a good approach? Does anyone have a better approach that has worked well for them?


(Srinivasan Ramaswamy) #2

One idea I am considering now is to use dynamic templates (https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html) to tell Elasticsearch which analyzers to use for which kinds of fields.

Although we are enabling multiple tenants/clients to store and search their data, we don't want them to lose the flexibility of choosing the analyzers they want. I thought we could use dynamic templates to give them that flexibility.
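
As a rough sketch of what I have in mind, the index body below picks an analyzer from a suffix on the field name. The suffix convention (*_en, *_ws), the template names, and the analyzer_template helper are my own assumptions for illustration; "english" and "whitespace" are built-in analyzers. The code only builds the JSON body and does not call Elasticsearch.

```python
import json


def analyzer_template(suffix, analyzer):
    """Build one dynamic template: string fields named *_<suffix> become text with the given analyzer."""
    return {
        f"text_{suffix}": {
            "match_mapping_type": "string",
            "match": f"*_{suffix}",
            "mapping": {"type": "text", "analyzer": analyzer},
        }
    }


index_body = {
    "mappings": {
        "dynamic_templates": [
            analyzer_template("en", "english"),
            analyzer_template("ws", "whitespace"),
            # Fallback: any other new string field is mapped as keyword.
            {
                "other_strings_as_keyword": {
                    "match_mapping_type": "string",
                    "mapping": {"type": "keyword"},
                }
            },
        ]
    }
}

print(json.dumps(index_body, indent=2))
```

Templates are checked in order and the first match wins, so the fallback has to come last. A tenant would then choose an analyzer simply by naming a field description_en or sku_ws.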

One worry we have is the performance impact that dynamic templates might have on indexing and search. Does anyone have any experience with this?