I have a multitenant architecture with a large set of "master" data shared by all tenants, plus smaller sets of tenant-specific data.
To query for a specific tenant only, I use a filtered alias whose filter excludes child documents based on a field that exists only on child documents. The filter matches all "parent" documents and only the "child" documents relevant to that alias.
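To make the setup concrete, here is a minimal sketch of what such an alias definition might look like, built as the request body for `POST /_aliases`. The index name `catalog`, the alias name `catalog_tenant_a`, and the field `tenant_id` are hypothetical placeholders, not the real names; the filter shape is an assumption based on the description above (parents lack the field, children carry it).

```python
import json


def build_tenant_alias_actions(index, alias, tenant_field, tenant_id):
    """Build a request body for POST /_aliases that adds a filtered alias.

    All names passed in are illustrative; the real mapping may differ.
    """
    alias_filter = {
        "bool": {
            "should": [
                # Parent ("master") documents: assumed to never carry the tenant field.
                {"bool": {"must_not": {"exists": {"field": tenant_field}}}},
                # Child documents that belong to this tenant.
                {"term": {tenant_field: tenant_id}},
            ],
            "minimum_should_match": 1,
        }
    }
    return {
        "actions": [
            {"add": {"index": index, "alias": alias, "filter": alias_filter}}
        ]
    }


alias_body = build_tenant_alias_actions(
    "catalog", "catalog_tenant_a", "tenant_id", "tenant_a"
)
print(json.dumps(alias_body, indent=2))
```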
When I run a query for child documents on the alias, the results are correctly filtered. When I query for parent documents and request "inner_hits", all child documents are returned, not just the ones visible to that alias. In other words, the "inner_hits" results bypass the alias filter.
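For illustration, the kind of parent query that shows this behaviour could be sketched as follows. It builds a `has_child` search body with `inner_hits` enabled, to be sent to the alias's `_search` endpoint. The child relation name `child` is a hypothetical placeholder, and `match_all` stands in for whatever child criteria the real query uses.

```python
import json


def build_parent_query_with_inner_hits(child_type):
    """Build a search body that returns parents plus their matching children.

    `child_type` is the (assumed) name of the child relation in the join field.
    """
    return {
        "query": {
            "has_child": {
                "type": child_type,
                # Placeholder child criteria; note there is no tenant condition here.
                "query": {"match_all": {}},
                # Ask for the matching child documents alongside each parent.
                "inner_hits": {},
            }
        }
    }


search_body = build_parent_query_with_inner_hits("child")
print(json.dumps(search_body, indent=2))
```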
Question #1: Why doesn't this work?
If the answer is "alias filters don't apply to inner_hits", the only other options I see for securing this properly are:
- Document-level permissions with a role per tenant
  - Unfortunately, this means disabling the query cache, which is a non-starter: I have an aggregation-heavy workload that needs caching to be usable.
- An index per tenant
  - If each index contains the "master" data, that is a massive duplication that I want to avoid at any cost.
  - If each index is kept separate from the "master" index, I would need to do all joins on the application side.
Question #2: Can someone help here? If the filtered alias doesn't apply to "inner_hits", I see no scalable way to do this without one of the following:
- Rewriting every query to be tenant-specific
- Resorting to application-side joins
- Duplicating all "master" data
- Disabling caching
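For reference, the first option in the list above (rewriting every query to be tenant-specific) would look roughly like the sketch below: the tenant condition is repeated inside the `has_child` query, so that, as far as I understand, the inner hits are limited to children matching it. The relation name `child` and the field `tenant_id` are again hypothetical placeholders.

```python
import json


def build_tenant_scoped_parent_query(child_type, tenant_field, tenant_id):
    """Build a has_child search body with the tenant filter embedded in it.

    Names are illustrative; this is the per-query rewrite, not an alias-level fix.
    """
    return {
        "query": {
            "has_child": {
                "type": child_type,
                "query": {
                    "bool": {
                        # Tenant restriction applied to the child documents themselves.
                        "filter": [{"term": {tenant_field: tenant_id}}]
                    }
                },
                "inner_hits": {},
            }
        }
    }


scoped_body = build_tenant_scoped_parent_query("child", "tenant_id", "tenant_a")
print(json.dumps(scoped_body, indent=2))
```

Note that this also changes which parents match: only parents with at least one child for that tenant are returned, and every query in the application would need the same treatment.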