Elastic Agent Fleet Setup unauthorized

When trying to deploy a Fleet Server via ECK, sometimes the pod is never created. The operator logs show:

{"log.level":"error","@timestamp":"2022-10-25T11:42:57.936Z","log.logger":"manager.eck-operator","message":"Reconciler error","service.version":"2.4.0+96282ca9","service.type":"eck","ecs.version":"1.4.0","controller":"agent-controller","object":{"name":"fleet-server","namespace":"default"},"namespace":"default","name":"fleet-server","reconcileID":"93ea354c-2f8b-4546-a42a-b790a8b4b337","error":"failed to request https://chimera-kb-http.default.svc:5601/api/fleet/setup, status is 401)","errorCauses":[{"error":"failed to request https://chimera-kb-http.default.svc:5601/api/fleet/setup, status is 401)"}],"error.stack_trace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234"}
{"log.level":"info","@timestamp":"2022-10-25T11:43:38.897Z","log.logger":"agent-controller","message":"Starting reconciliation run","service.version":"2.4.0+96282ca9","service.type":"eck","ecs.version":"1.4.0","iteration":"151","namespace":"default","agent_name":"fleet-server"}
{"log.level":"info","@timestamp":"2022-10-25T11:43:38.897Z","log.logger":"generic-reconciler","message":"Updating resource","service.version":"2.4.0+96282ca9","service.type":"eck","ecs.version":"1.4.0","kind":"Service","namespace":"default","name":"fleet-server-agent-http"}
{"log.level":"info","@timestamp":"2022-10-25T11:44:38.905Z","log.logger":"agent-controller","message":"Ending reconciliation run","service.version":"2.4.0+96282ca9","service.type":"eck","ecs.version":"1.4.0","iteration":"151","namespace":"default","agent_name":"fleet-server","took":60.007968667}
{"log.level":"error","@timestamp":"2022-10-25T11:44:38.905Z","log.logger":"manager.eck-operator","message":"Reconciler error","service.version":"2.4.0+96282ca9","service.type":"eck","ecs.version":"1.4.0","controller":"agent-controller","object":{"name":"fleet-server","namespace":"default"},"namespace":"default","name":"fleet-server","reconcileID":"d80da767-628e-4211-9158-a4de7b2cf1eb","error":"Post \"https://chimera-kb-http.default.svc:5601/api/fleet/setup\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)","errorCauses":[{"error":"Post \"https://chimera-kb-http.default.svc:5601/api/fleet/setup\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"}],"error.stack_trace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234"}

kubectl get agent shows:

NAME                                      HEALTH   AVAILABLE   EXPECTED   VERSION   AGE
agent.agent.k8s.elastic.co/fleet-server                                             5m18s

Logs from the Kibana pod:

{"type":"log","@timestamp":"2022-10-25T11:42:57+00:00","tags":["info","plugins","security","authentication"],"pid":7,"message":"Authentication attempt failed: {\"error\":{\"root_cause\":[{\"type\":\"security_exception\",\"reason\":\"unable to authenticate user [default-fleet-server-agent-kb-user] for REST request [/_security/_authenticate]\",\"header\":{\"WWW-Authenticate\":[\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\",\"Bearer realm=\\\"security\\\"\",\"ApiKey\"]}}],\"type\":\"security_exception\",\"reason\":\"unable to authenticate user [default-fleet-server-agent-kb-user] for REST request [/_security/_authenticate]\",\"header\":{\"WWW-Authenticate\":[\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\",\"Bearer realm=\\\"security\\\"\",\"ApiKey\"]}},\"status\":401}"}
{"type":"response","@timestamp":"2022-10-25T11:42:57+00:00","tags":[],"pid":7,"method":"post","statusCode":401,"req":{"url":"/api/fleet/setup","method":"post","headers":{"host":"chimera-kb-http.default.svc:5601","user-agent":"Go-http-client/1.1","content-length":"0","kbn-xsrf":"true","x-elastic-product-origin":"cloud","accept-encoding":"gzip"},"remoteAddress":"10.244.1.5","userAgent":"Go-http-client/1.1"},"res":{"statusCode":401,"responseTime":36,"contentLength":323},"message":"POST /api/fleet/setup 401 36ms - 323.0B"}

Sometimes this resolves itself, but other times the pod never seems to be created.

The Agent YAML is below:

kind: Agent
metadata:
  name: fleet-server
  namespace: default
spec:
  version: 7.17.6
  kibanaRef:
    name: chimera
  elasticsearchRefs:
  - name: chimera
  http:
    service:
      spec:
        type: LoadBalancer
        ports:
        - name: https
          port: 443
          targetPort: 8220
          protocol: TCP
    tls:
      certificate:
        secretName: fleet-server-certificate
  mode: fleet
  fleetServerEnabled: true
  policyID: eck-fleet-server
  deployment:
    replicas: 1
    podTemplate:
      spec:
        securityContext:
          runAsUser: 0

This seems to happen consistently when I use a local Docker registry instead of docker.elastic.co. I'm setting the registry via the ECK operator argument --container-registry=<registry>. That may be a coincidence; it definitely seems to be something odd with getting the credentials of the elastic user. While waiting for the agent to start, I deleted the operator pod, and once it was back the Fleet Server agent came up without issue.
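
For anyone hitting the same thing, the workaround above (deleting the operator pod) can be sketched roughly as follows. The operator namespace elastic-system and pod name elastic-operator-0 are assumptions based on a default ECK install and are not taken from my cluster output above; the helper function names are made up for illustration. Adjust them to match your setup.

```shell
# Sketch only: namespace and pod name are assumed ECK install defaults,
# override them via the environment if your install differs.
ECK_NAMESPACE="${ECK_NAMESPACE:-elastic-system}"
ECK_OPERATOR_POD="${ECK_OPERATOR_POD:-elastic-operator-0}"

# Delete the operator pod; its controller should recreate it,
# which restarts reconciliation of the Agent resource.
restart_eck_operator() {
  kubectl delete pod -n "$ECK_NAMESPACE" "$ECK_OPERATOR_POD"
}

# Show the Agent resource from this post to see whether
# HEALTH / AVAILABLE / EXPECTED get populated afterwards.
check_fleet_agent() {
  kubectl get agent fleet-server -n default
}
```

Run restart_eck_operator, wait for the operator pod to come back, then run check_fleet_agent. This only works around the symptom; it does not explain why the 401 on /api/fleet/setup happens in the first place.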
