The OpenShift environment has a security policy that requires containers to run with a randomized UID, which causes issues when running the Elastic Agent. The standard solution is to ensure the user belongs to group 0 (root): the root group carries no special privileges in this context, but it allows consistent file access across arbitrary UIDs.
The elastic-agent-complete image (and likely the standard elastic-agent image) uses the following IDs:
podman run --rm -it --entrypoint /usr/bin/id elastic/elastic-agent-complete:9.3.1
uid=1000(elastic-agent) gid=1000(elastic-agent) groups=1000(elastic-agent),0(root)
Starting the container results in numerous errors such as:
{
  "log.level": "error",
  "@timestamp": "2026-03-05T10:02:07.365Z",
  "message": "Failed to list light metricsets for module tomcat: getting metricsets for module 'tomcat': loading light module 'tomcat' definition: loading module configuration from '/usr/share/elastic-agent/data/elastic-agent-2ec825/components/module/tomcat/module.yml': config file (\"/usr/share/elastic-agent/data/elastic-agent-2ec825/components/module/tomcat/module.yml\") must be owned by the user identifier (uid=1000730000) or root",
  "component": {
    "binary": "metricbeat",
    "dataset": "elastic_agent.metricbeat",
    "id": "beat/metrics-monitoring",
    "type": "beat/metrics"
  },
  "service.name": "metricbeat",
  "log.logger": "registry.lightmodules",
  "log.origin": {
    "file.line": 145,
    "file.name": "mb/lightmodules.go",
    "function": "github.com/elastic/beats/v7/metricbeat/mb.(*LightModulesSource).ModulesInfo"
  },
  "resource": {
    "service.instance.id": "fe0de429-1ead-41a0-8785-e5a3890390db",
    "service.name": "/usr/share/elastic-agent/data/elastic-agent-2ec825/components/elastic-otel-collector",
    "service.version": "9.3.1"
  },
  "otelcol.component.id": "metricbeatreceiver/_agent-component/beat/metrics-monitoring",
  "otelcol.signal": "logs",
  "log": {
    "source": "beat/metrics-monitoring"
  },
  "ecs.version": "1.6.0",
  "otelcol.component.kind": "receiver"
}
Initially, I assumed the check only verified whether the file was writable by the process; even though the error message clearly states otherwise, the check arguably should not fail when the process can in fact write to the file. I created a minimal Dockerfile to test this:
FROM elastic/elastic-agent-complete:9.3.1
USER root
# Make group 0 the primary group of elastic-agent, mirror the owner's
# permission bits onto the group, and reassign all files to elastic-agent:root.
RUN usermod -g 0 elastic-agent && \
    find / -not -path "/proc/*" -user elastic-agent -exec chmod g+u {} \; && \
    find / -not -path "/proc/*" -group elastic-agent -exec chown elastic-agent:root {} \; && \
    groupdel elastic-agent
USER elastic-agent
The root group should now have all the permissions that the owner has, but the error persists. I examined the Beats source code in libbeat/common/config.go (main branch on GitHub):
func OwnerHasExclusiveWritePerms(name string) error {
This function checks whether the running user (EUID, the effective user ID) has exclusive write permission on the file. The problem is that with a randomized UID this check always fails, even though the check does allow the file to be owned by root as an alternative.
To retain write access under an arbitrary UID, the group (0, root, as seen in images prepared by Red Hat) needs 'rw' or 'rwx' permissions.
podman run --rm --entrypoint /usr/bin/id registry.access.redhat.com/ubi10/httpd-24
uid=1001(default) gid=0(root) groups=0(root)
Another potential workaround would be to make all files owned by root, but that does not help either, because the second part of the check rejects group-writable files:
// Test if group or other have write permissions.
if perm&0022 > 0 {
	nameAbs, err := filepath.Abs(name)
	if err != nil {
		nameAbs = name
	}
	return fmt.Errorf(`config file ("%v") can only be writable by the `+
		`owner but the permissions are "%v" (to fix the permissions use: `+
		`'chmod go-w %v')`,
		name, perm, nameAbs)
}
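The effect of that mask can be shown with a tiny sketch (`groupOrOtherWritable` is my own helper name, mirroring the quoted `perm&0022 > 0` condition):

```go
package main

import (
	"fmt"
	"io/fs"
)

// groupOrOtherWritable mirrors the quoted check: any mode with the
// group-write (0o020) or other-write (0o002) bit set is rejected.
func groupOrOtherWritable(perm fs.FileMode) bool {
	return perm&0o022 != 0
}

func main() {
	fmt.Println(groupOrOtherWritable(0o644)) // false: owner-only write passes
	fmt.Println(groupOrOtherWritable(0o664)) // true: group-writable is rejected
	fmt.Println(groupOrOtherWritable(0o600)) // false
}
```

So the exact permission bits that the arbitrary-UID workaround depends on (g+w) are the ones this branch forbids.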
This logic causes the group-write workaround to fail consistently in hardened environments where group-write permissions are necessary for the container to function at all.
IMO the current check should be changed to:
- Allow group write access when the EUID does not match the file owner (e.g. an arbitrary UID running with group 0).