We’re working on strengthening our Elastic deployment to better support a multi-tenant environment. Today, we’re handling multi-tenancy by using separate namespaces within agent policies in Fleet, which allows us to segment data per tenant. We also create a dedicated Kibana Space for each client and assign them read-only access to their own data by defining roles restricted to their namespace-specific index patterns.
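For reference, our per-tenant roles look roughly like the sketch below (the tenant name, index patterns, Space ID, and credentials are placeholders for our actual naming scheme; the request shape follows Kibana's role API):

```python
import requests

KIBANA_URL = "https://kibana.example.com"  # placeholder
AUTH = ("admin_user", "admin_password")    # placeholder credentials

# Read-only role scoped to one tenant's namespaced data streams
# and to that tenant's Kibana Space only.
role = {
    "elasticsearch": {
        "indices": [
            {
                "names": ["logs-*-tenant_a", "metrics-*-tenant_a"],
                "privileges": ["read", "view_index_metadata"],
            }
        ]
    },
    "kibana": [
        {"base": ["read"], "spaces": ["tenant-a"]}
    ],
}

resp = requests.put(
    f"{KIBANA_URL}/api/security/role/tenant_a_readonly",
    json=role,
    auth=AUTH,
    headers={"kbn-xsrf": "true"},  # required by Kibana's HTTP API
)
resp.raise_for_status()
```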
Where we’re running into challenges is with our shared “Root” Space. Currently, we use the Default Kibana space as this root space, and we have roughly 300 rules enabled that run across all indices, regardless of namespace. Because all rules exist within a single space, our ML jobs are not tenant-aware, and AI features cannot be scoped to analyze only a specific client’s data.
Additionally, alerts generated by the Elastic Security rules are written to the Kibana space–specific alert indices. This prevents us from exposing only a single client’s alerts within their own Kibana Space.
We’ve considered duplicating the rules into each client’s Kibana Space and using space-aware index patterns, but this would mean managing ~300 rules per client instead of ~300 total, creating significant operational overhead as we scale.
Our goal is to achieve a true multi-tenant deployment that preserves strict data segmentation while allowing ML and AI features to operate on a per-client (space-aware) basis. We’re hoping to hear from anyone who has successfully implemented a similar multi-tenant architecture and can offer guidance based on the challenges described above.
We found this article helpful, but it doesn’t fully address our concerns.
Unfortunately, there’s not currently a native way to “multiplex” detection rules by running them once in a root Space and having them write alerts to the respective tenant Spaces.
The typical way we’ve seen this done is, as you suggest, running a separate instance of the rules in each tenant’s Kibana Space. This ensures that the alerts remain in the proper tenant Spaces as desired, but does come at the expense of increasing the total number of rules running in the cluster.
It is common to modify every rule to "narrow" its scope to the tenant’s namespaced data. This approach puts specific controls in place to keep tenant data isolated but, as you say, it creates management overhead. Some users create and deploy detections-as-code tooling to automate the modification and distribution of such rules to the tenant Spaces, along the lines of the sketch below.
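As a rough illustration of what that tooling can look like (a sketch only, with placeholder names and credentials; the endpoints follow the standard Detection Engine API, but real tooling also has to handle prebuilt-rule fields, exception lists, and updates rather than blind creation):

```python
import requests

KIBANA_URL = "https://kibana.example.com"  # placeholder
AUTH = ("deploy_user", "deploy_password")  # placeholder credentials
HEADERS = {"kbn-xsrf": "true"}
TENANTS = ["tenant-a", "tenant-b"]         # Kibana Space IDs (placeholders)

# Read-only fields the create API will reject; a real tool handles more
# cases than this (prebuilt rules, exceptions, update-vs-create, paging).
READ_ONLY = {"id", "created_at", "created_by", "updated_at", "updated_by",
             "immutable", "execution_summary", "rule_source", "revision"}

# 1. Export the baseline rule set from the root Space.
rules = requests.get(
    f"{KIBANA_URL}/api/detection_engine/rules/_find",
    params={"per_page": 500},
    auth=AUTH, headers=HEADERS,
).json()["data"]

# 2. Narrow each rule's index patterns and create it in every tenant Space.
for space in TENANTS:
    namespace = space.replace("-", "_")  # assumes Space ID maps to namespace
    for rule in rules:
        copy = {k: v for k, v in rule.items() if k not in READ_ONLY}
        copy["index"] = [f"logs-*-{namespace}"]         # tenant-scoped pattern
        copy["rule_id"] = f"{rule['rule_id']}-{space}"  # unique per Space
        requests.post(
            f"{KIBANA_URL}/s/{space}/api/detection_engine/rules",
            json=copy, auth=AUTH, headers=HEADERS,
        ).raise_for_status()
```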
The article you cite offers another approach, which would allow you to run an identical set of rules (still one instance of the rule set per Space) by ensuring that the rules are running with the permissions of a user whose role has access only to the data in the tenant’s namespace. Detection rules run using an API key that carries the permissions of the user that created or last edited the rule. So when the rule runs, even though its index pattern is broad (all tenants), the only data returned to the rule (and thus to any alerts created) will be from the proper tenant’s data. The cautions suggested in that post should be considered. This approach can be susceptible to data isolation errors if you inadvertently edit/create rules as a superuser, or as any role with permission to access data beyond the current tenant’s data.
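In practice that means performing rule creation and edits while authenticated as the tenant-scoped user, so the API key stored with the rule carries only that role’s privileges. A minimal sketch, assuming a tenant-scoped user and Space ID like the placeholders below:

```python
import requests

KIBANA_URL = "https://kibana.example.com"          # placeholder
# Authenticate as the tenant-scoped user, NOT as a superuser; the rule's
# stored API key will inherit exactly this user's (narrow) privileges.
TENANT_AUTH = ("tenant_a_detections", "password")  # placeholder
HEADERS = {"kbn-xsrf": "true"}

rule = {
    "rule_id": "suspicious-powershell-tenant-a",
    "name": "Suspicious PowerShell",
    "type": "query",
    "query": 'process.name: "powershell.exe" and event.type: "start"',
    # The index pattern can stay broad: the role behind TENANT_AUTH only
    # grants access to tenant A's namespace, so only that data is searched.
    "index": ["logs-*"],
    "risk_score": 47,
    "severity": "medium",
    "description": "Example rule created with tenant-scoped credentials.",
    "interval": "5m",
    "from": "now-6m",
    "enabled": True,
}

resp = requests.post(
    f"{KIBANA_URL}/s/tenant-a/api/detection_engine/rules",
    json=rule, auth=TENANT_AUTH, headers=HEADERS,
)
resp.raise_for_status()
```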
Similar steps would need to be performed for your ML anomaly detection (AD) jobs.
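For example, each tenant gets its own job whose datafeed is pinned to that tenant’s namespaced data. A sketch using the standard ML APIs (job names, index patterns, and credentials are placeholders):

```python
import requests

ES_URL = "https://elasticsearch.example.com:9200"  # placeholder
AUTH = ("ml_admin", "password")                    # placeholder credentials

# Per-tenant anomaly detection job: rare process names in tenant A's logs.
job = {
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [{"function": "rare", "by_field_name": "process.name"}],
        "influencers": ["host.name"],
    },
    "data_description": {"time_field": "@timestamp"},
}
requests.put(f"{ES_URL}/_ml/anomaly_detectors/tenant-a-rare-process",
             json=job, auth=AUTH).raise_for_status()

# The datafeed is pinned to the tenant's namespaced data stream. Datafeeds
# also run with the permissions of the user who created (or last updated)
# them, so the same create-as-a-narrow-user caution applies here.
datafeed = {
    "job_id": "tenant-a-rare-process",
    "indices": ["logs-*-tenant_a"],
}
requests.put(f"{ES_URL}/_ml/datafeeds/datafeed-tenant-a-rare-process",
             json=datafeed, auth=AUTH).raise_for_status()
```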
To better advise, could you provide information on your current implementation status, whether you are utilizing Elastic Cloud, and the anticipated number of unique tenants per cluster?