Error when trying to install Fleet Server and Elastic Agent on an OpenShift K8s cluster (quickstart)

Hi all.

I'm trying to follow the quickstart procedure to get an ECK cluster running on an OpenShift K8s cluster.

I've created a namespace called 'elastic'

After that I downloaded the operator manifest:

wget https://download.elastic.co/downloads/eck/2.5.0/operator.yaml

I replaced every occurrence of 'namespace: default' with 'namespace: elastic' and ran:

oc create -f https://download.elastic.co/downloads/eck/2.5.0/crds.yaml -n elastic

oc apply -f operator.yaml -n elastic
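
The namespace replacement described above can be scripted with sed. This is only a minimal sketch: the three-line sample file is a hypothetical stand-in for the downloaded operator.yaml so that the snippet runs on its own, and any namespace reference not written exactly as 'namespace: default' would not be touched.

```shell
# Hypothetical stand-in for the downloaded operator.yaml (the real file is much longer).
workdir=$(mktemp -d)
cat > "$workdir/operator.yaml" <<'EOF'
metadata:
  name: elastic-operator
  namespace: default
EOF

# Rewrite only lines that end in "namespace: default", writing to a new file
# instead of editing in place so the original stays available for comparison.
sed 's/namespace: default$/namespace: elastic/' "$workdir/operator.yaml" > "$workdir/operator-elastic.yaml"

# Count the rewritten lines as a quick sanity check.
replaced=$(grep -c 'namespace: elastic$' "$workdir/operator-elastic.yaml")
echo "lines now pointing at the elastic namespace: $replaced"
```

Running diff on the two files before applying shows exactly which lines changed.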

Finally, I created a file called 'quickstart_fleet.yaml' and applied it too:

oc apply -f quickstart_fleet.yaml -n elastic

Its content is the following:

apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: fleet-server-quickstart
  namespace: elastic
spec:
  version: 8.5.3
  kibanaRef:
    name: kibana-quickstart
  elasticsearchRefs:
  - name: elasticsearch-quickstart
  mode: fleet
  fleetServerEnabled: true
  deployment:
    replicas: 1
    podTemplate:
      spec:
        serviceAccountName: elastic-agent
        automountServiceAccountToken: true
        securityContext:
          runAsUser: 0
---
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: elastic-agent-quickstart
  namespace: elastic
spec:
  version: 8.5.3
  kibanaRef:
    name: kibana-quickstart
  fleetServerRef:
    name: fleet-server-quickstart
  mode: fleet
  daemonSet:
    podTemplate:
      spec:
        serviceAccountName: elastic-agent
        automountServiceAccountToken: true
        securityContext:
          runAsUser: 0
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana-quickstart
  namespace: elastic
spec:
  version: 8.5.3
  count: 1
  elasticsearchRef:
    name: elasticsearch-quickstart
  config:
    xpack.fleet.agents.elasticsearch.hosts: ["https://elasticsearch-quickstart-es-http.default.svc:9200"]
    xpack.fleet.agents.fleet_server.hosts: ["https://fleet-server-quickstart-agent-http.default.svc:8220"]
    xpack.fleet.packages:
      - name: system
        version: latest
      - name: elastic_agent
        version: latest
      - name: fleet_server
        version: latest
    xpack.fleet.agentPolicies:
      - name: Fleet Server on ECK policy
        id: eck-fleet-server
        is_default_fleet_server: true
        namespace: elastic
        monitoring_enabled:
          - logs
          - metrics
        unenroll_timeout: 900
        package_policies:
        - name: fleet_server-1
          id: fleet_server-1
          package:
            name: fleet_server
      - name: Elastic Agent on ECK policy
        id: eck-agent
        namespace: elastic
        monitoring_enabled:
          - logs
          - metrics
        unenroll_timeout: 900
        is_default: true
        package_policies:
          - name: system-1
            id: system-1
            package:
              name: system
---
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-quickstart
  namespace: elastic
spec:
  version: 8.5.3
  nodeSets:
  - name: default
    count: 3
    config:
      node.store.allow_mmap: false
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: elastic-agent
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - pods
  - nodes
  - namespaces
  verbs:
  - get
  - watch
  - list
- apiGroups: ["coordination.k8s.io"]
  resources:
  - leases
  verbs:
  - get
  - create
  - update
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elastic-agent
  namespace: elastic
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: elastic-agent
subjects:
- kind: ServiceAccount
  name: elastic-agent
  namespace: elastic
roleRef:
  kind: ClusterRole
  name: elastic-agent
  apiGroup: rbac.authorization.k8s.io

After a while, this is what I get:

$ oc get pods -n elastic
NAME                                    READY   STATUS    RESTARTS   AGE
elastic-operator-0                      1/1     Running   0          6m57s
elasticsearch-quickstart-es-default-0   1/1     Running   0          6m17s
elasticsearch-quickstart-es-default-1   1/1     Running   0          6m17s
elasticsearch-quickstart-es-default-2   1/1     Running   0          6m17s
kibana-quickstart-kb-785dd6bd-pdzzs     1/1     Running   0          6m9s

CARLOS@DESKTOP-AN5HEQG MINGW64 ~/Desktop/ELK/Openshift curro/samples
$ oc get services -n elastic
NAME                                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
elastic-webhook-server                      ClusterIP   xyz              <none>        443/TCP    7m
elasticsearch-quickstart-es-default         ClusterIP   None             <none>        9200/TCP   6m21s
elasticsearch-quickstart-es-http            ClusterIP   xyz              <none>        9200/TCP   6m24s
elasticsearch-quickstart-es-internal-http   ClusterIP   xyz              <none>        9200/TCP   6m24s
elasticsearch-quickstart-es-transport       ClusterIP   None             <none>        9300/TCP   6m24s
fleet-server-quickstart-agent-http          ClusterIP   xyz              <none>        8220/TCP   6m11s
kibana-quickstart-kb-http                   ClusterIP   xyz              <none>        5601/TCP   6m19s

CARLOS@DESKTOP-AN5HEQG MINGW64 ~/Desktop/ELK/Openshift curro/samples
$ oc get agents -n elastic
NAME                       HEALTH   AVAILABLE   EXPECTED   VERSION   AGE
elastic-agent-quickstart                                             6m36s
fleet-server-quickstart    red                                       6m36s

Looking at the OpenShift project, I can see that there is a Pod that could not be created:

Error creating: pods "fleet-server-quickstart-agent-548d46bb7-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider "my-scc-admin": Forbidden: not usable by user or serviceaccount, provider "pipelines-scc": Forbidden: not usable by user or serviceaccount, provider "nginx-ingress-scc": Forbidden: not usable by user or serviceaccount, spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.containers[0].securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000820000, 1000829999], provider "ibm-restricted-scc": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "ibm-anyuid-hostpath-scc": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "ibm-anyuid-hostaccess-scc": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "ibm-privileged-scc": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
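
Most of that message is the list of SCCs the service account is not allowed to use. The entries that describe actual validation failures are easier to see once those are filtered out. A minimal sketch, using a shortened copy of the message (most provider entries omitted) and assuming awk and grep are available:

```shell
# Shortened copy of the event message above; only two of the "provider" entries are kept.
msg='provider "anyuid": Forbidden: not usable by user or serviceaccount, spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.containers[0].securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000820000, 1000829999], provider "privileged": Forbidden: not usable by user or serviceaccount'

# Put each entry on its own line, then drop the "SCC not usable" entries.
# Splitting on ", provider" and ", spec." keeps the UID range "[a, b]" intact.
reasons=$(printf '%s\n' "$msg" \
  | awk '{ gsub(/, provider "/, "\nprovider \""); gsub(/, spec\./, "\nspec."); print }' \
  | grep -v 'not usable by user or serviceaccount')

printf '%s\n' "$reasons"
```

With the sample above, what remains is the hostPath volume entry and the runAsUser range entry, which are the two things the Pod spec asks for that the only usable SCC does not allow.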

After investigating, I thought that running this command could fix the error:

$ oc adm policy add-scc-to-user privileged -z elastic-agent -n elastic
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:privileged added: "elastic-agent"

After a while...

It seems that three Elastic Agent Pods have been created, and they are restarting in a loop with this error:


CARLOS@DESKTOP-AN5HEQG MINGW64 ~/Desktop/ELK/Openshift curro/samples
$ oc get pods -n elastic
NAME                                            READY   STATUS             RESTARTS      AGE
elastic-agent-quickstart-agent-cntrf            0/1     CrashLoopBackOff   3 (41s ago)   2m15s
elastic-agent-quickstart-agent-m7w6x            0/1     CrashLoopBackOff   4 (20s ago)   2m15s
elastic-operator-0                              1/1     Running            0             18m
elasticsearch-quickstart-es-default-0           1/1     Running            0             18m
elasticsearch-quickstart-es-default-1           1/1     Running            0             18m
elasticsearch-quickstart-es-default-2           1/1     Running            0             18m
fleet-server-quickstart-agent-548d46bb7-lkd5k   0/1     CrashLoopBackOff   3 (41s ago)   2m16s
kibana-quickstart-kb-785dd6bd-pdzzs             1/1     Running            0             18m


CARLOS@DESKTOP-AN5HEQG MINGW64 ~/Desktop/ELK/Openshift curro/samples
$ oc logs elastic-agent-quickstart-agent-cntrf
Updating certificates in /etc/ssl/certs...
rehash: warning: skipping ca-certificates.crt,it does not contain exactly one certificate or CRL
1 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
Error: preparing STATE_PATH(/usr/share/elastic-agent/state) failed: mkdir /usr/share/elastic-agent/state/data: permission denied
For help, please see our troubleshooting guide at https://www.elastic.co/guide/en/fleet/8.5/fleet-troubleshooting.html

CARLOS@DESKTOP-AN5HEQG MINGW64 ~/Desktop/ELK/Openshift curro/samples
$ oc logs elastic-agent-quickstart-agent-m7w6x
Updating certificates in /etc/ssl/certs...
rehash: warning: skipping ca-certificates.crt,it does not contain exactly one certificate or CRL
1 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
Error: preparing STATE_PATH(/usr/share/elastic-agent/state) failed: mkdir /usr/share/elastic-agent/state/data: permission denied
For help, please see our troubleshooting guide at https://www.elastic.co/guide/en/fleet/8.5/fleet-troubleshooting.html



CARLOS@DESKTOP-AN5HEQG MINGW64 ~/Desktop/ELK/Openshift curro/samples
$ oc logs fleet-server-quickstart-agent-548d46bb7-lkd5k
Updating certificates in /etc/ssl/certs...
rehash: warning: skipping ca-certificates.crt,it does not contain exactly one certificate or CRL
1 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
Error: preparing STATE_PATH(/usr/share/elastic-agent/state) failed: mkdir /usr/share/elastic-agent/state/data: permission denied
For help, please see our troubleshooting guide at https://www.elastic.co/guide/en/fleet/8.5/fleet-troubleshooting.html

And here is where I'm stuck. I don't know whether the error I found earlier in the OpenShift console
"securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000820000, 1000829999]"
is related to the Pods being unable to run mkdir due to lack of permissions.

Can someone please advise?

Thank you very much. Once again.

Kind regards.

Carlos T.

Hello,

It seems to be a case similar to what is described here; perhaps it will help you.
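
One change that is often suggested for this STATE_PATH error, and which may or may not be what the linked thread describes, is to replace the default hostPath state volume with an emptyDir. The sketch below is untested against this cluster. It assumes the state volume is named agent-data, as in the ECK documentation, and that oc is logged in to the cluster; the resource names and namespace are taken from the post.

```shell
# Merge patches that declare the agent state volume as an emptyDir.
# "agent-data" is the volume name used in the ECK docs; treat it as an assumption for your ECK version.
deployment_patch='{"spec":{"deployment":{"podTemplate":{"spec":{"volumes":[{"name":"agent-data","emptyDir":{}}]}}}}}'
daemonset_patch='{"spec":{"daemonSet":{"podTemplate":{"spec":{"volumes":[{"name":"agent-data","emptyDir":{}}]}}}}}'

if command -v oc >/dev/null 2>&1; then
  # Fleet Server runs as a Deployment, the regular agents as a DaemonSet.
  oc patch agent fleet-server-quickstart -n elastic --type merge -p "$deployment_patch"
  oc patch agent elastic-agent-quickstart -n elastic --type merge -p "$daemonset_patch"
else
  echo "oc not found; patches not applied"
fi
```

With an emptyDir the agent state does not survive a Pod restart, which is the trade-off for not needing the hostPath volume. The Kibana config in the post also points both xpack.fleet.agents hosts at *.default.svc, while the services listed are in the elastic namespace, so those URLs probably need .elastic.svc instead.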

Thank you very much Anderson.
I'll take a look and try.

Although my boss is assigning me different tasks now.

Best regards and thanks again
