ECK - Fleet Server agent startup failure

I followed these steps, but I am getting the error below during startup.

Note: I am using custom TLS certificates generated with cert-manager.

k logs -f fleet-server-agent-6f976f656c-l5vtd
cp: cannot stat '/mnt/elastic-internal/elasticsearch-association/logging/elasticsearch/certs/ca.crt': No such file or directory

Maybe I am missing the mounts, but I was not sure how to achieve this.

Could you share the YAML manifest that you use to deploy the Elastic resources?

We use a custom CA for ECK, so I mounted it and used the same path. That got the Agents healthy, but there seem to be no data streams, and no errors in the Pods. After exec'ing into the Pod, elastic-agent status shows everything is healthy.

Even after changing policies, no logs or data are sent to Elasticsearch. There seems to be some connection issue between the Agent and Elasticsearch, but I am not able to see those errors.

It would be great if you could guide us.

Below is the manifest:

---
# Source: eck/templates/elastic-agent.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elastic-agent
  namespace: logging
  labels:
    k8s-app: elastic-agent
---
# Source: eck/templates/elastic-agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: elastic-agent
  namespace: logging
  labels:
    k8s-app: elastic-agent
rules:
  - apiGroups: [""] # "" indicates the core API group
    resources:
      - namespaces
      - pods
      - nodes
      - nodes/metrics
      - nodes/proxy
      - nodes/stats
      - events
      - services
      - configmaps
    verbs:
      - get
      - watch
      - list
  - apiGroups: [""]
    resources:
    - secrets
    verbs:
      - get
  - apiGroups: ["coordination.k8s.io"]
    resources:
      - leases
    verbs:
      - get
      - create
      - update
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get
  - apiGroups: ["extensions"]
    resources:
      - replicasets
    verbs:
      - "get"
      - "list"
      - "watch"
  - apiGroups:
      - "apps"
    resources:
      - statefulsets
      - deployments
      - replicasets
    verbs:
      - "get"
      - "list"
      - "watch"
  - apiGroups:
      - ""
    resources:
      - nodes/stats
    verbs:
      - get
  - apiGroups:
      - "batch"
    resources:
      - jobs
    verbs:
      - "get"
      - "list"
      - "watch"
---
# Source: eck/templates/elastic-agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: elastic-agent
  namespace: logging
subjects:
  - kind: ServiceAccount
    name: elastic-agent
    namespace: logging
roleRef:
  kind: ClusterRole
  name: elastic-agent
  apiGroup: rbac.authorization.k8s.io
---
# Source: eck/templates/elastic-agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elastic-agent
  # should be the namespace where elastic-agent is running
  namespace: logging
  labels:
    k8s-app: elastic-agent
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
# Source: eck/templates/elastic-agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elastic-agent-kubeadm-config
  namespace: logging
  labels:
    k8s-app: elastic-agent
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
# Source: eck/templates/elastic-agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: logging
  name: elastic-agent
subjects:
  - kind: ServiceAccount
    name: elastic-agent
    namespace: logging
roleRef:
  kind: Role
  name: elastic-agent
  apiGroup: rbac.authorization.k8s.io
---
# Source: eck/templates/elastic-agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: elastic-agent-kubeadm-config
  namespace: logging
subjects:
  - kind: ServiceAccount
    name: elastic-agent
    namespace: logging
roleRef:
  kind: Role
  name: elastic-agent-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
# Source: eck/templates/elastic-agent.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: elastic-agent
  namespace: logging
  labels:
    k8s-app: elastic-agent
spec:
  selector:
    matchLabels:
      k8s-app: elastic-agent
  template:
    metadata:
      labels:
        k8s-app: elastic-agent
    spec:
      imagePullSecrets:
        - name: regcredgitlabcom
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      serviceAccountName: elastic-agent
      containers:
        - name: elastic-agent
          image: "redacted.com/eck/eck-operator/beats/elastic-agent:7.14.0"
          env:
            - name: FLEET_ENROLL
              value: "1"
            # Set to true in case of insecure or unverified HTTP
            - name: FLEET_INSECURE
              value: "true"
              # The ip:port pair of fleet server
            - name: FLEET_URL
              value: https://fleet-server-agent-http.logging.svc:8220
              # If left empty KIBANA_HOST, KIBANA_FLEET_USERNAME, KIBANA_FLEET_PASSWORD are needed
            - name: FLEET_ENROLLMENT_TOKEN
              value: "REDACTED"
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          securityContext:
            runAsUser: 0
          resources:
            limits:
              memory: 500Mi
            requests:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - mountPath: /mnt/elastic-internal/elasticsearch-association/logging/elasticsearch/certs
              name: elasticsearch-certs
              readOnly: true
            - name: proc
              mountPath: /hostfs/proc
              readOnly: true
            - name: cgroup
              mountPath: /hostfs/sys/fs/cgroup
              readOnly: true
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: elasticsearch-certs
          secret:
            secretName: elasticsearch-certs
        - name: proc
          hostPath:
            path: /proc
        - name: cgroup
          hostPath:
            path: /sys/fs/cgroup
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: varlog
          hostPath:
            path: /var/log

Hey @sc75651, thanks for your question and the manifests.

I am a bit confused though - are you trying to run Fleet using ECK? That is what the recipe you linked does, but the manifest you've pasted tries to run Agents without ECK involvement. Also, the log from your first post indicates that there is a Fleet Server, but I don't see it in your manifest.

To run Fleet you need Elasticsearch, Kibana, Fleet Server, and Elastic Agents. Please let me know which of those you wish to run using ECK. You can run all of them, as in the recipe you linked, or only some. If you could show what you have configured and include manifests for all of the above resources, I'd be glad to help you further.

Thanks

Sorry for the confusion. I tried to run the entire setup (Fleet + Agent along with Elasticsearch and Kibana) using ECK, but the Agent did not work, hence the DaemonSet approach.

Attached is the entire manifest. The Fleet Server and Agent are green in Kibana, but I am not able to see any logs or data streams for either.

---
# Source: eck/templates/agent.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elastic-agent
  namespace: logging
  labels:
    k8s-app: elastic-agent
---
# Source: eck/templates/fleet.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fleet-server
  namespace: logging
  labels:
    k8s-app: fleet-server
---
# Source: eck/templates/agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: elastic-agent
  namespace: logging
  labels:
    k8s-app: elastic-agent
rules:
  - apiGroups: [""] # "" indicates the core API group
    resources:
      - namespaces
      - pods
      - nodes
      - nodes/metrics
      - nodes/proxy
      - nodes/stats
      - events
      - services
      - configmaps
    verbs:
      - get
      - watch
      - list
  - apiGroups: [""]
    resources:
    - secrets
    verbs:
      - get
  - apiGroups: ["coordination.k8s.io"]
    resources:
      - leases
    verbs:
      - get
      - create
      - update
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get
  - apiGroups: ["extensions"]
    resources:
      - replicasets
    verbs:
      - "get"
      - "list"
      - "watch"
  - apiGroups:
      - "apps"
    resources:
      - statefulsets
      - deployments
      - replicasets
    verbs:
      - "get"
      - "list"
      - "watch"
  - apiGroups:
      - ""
    resources:
      - nodes/stats
    verbs:
      - get
  - apiGroups:
      - "batch"
    resources:
      - jobs
    verbs:
      - "get"
      - "list"
      - "watch"
---
# Source: eck/templates/fleet.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fleet-server
  namespace: logging
  labels:
    k8s-app: fleet-server
rules:
  - apiGroups: [""]
    resources:
      - pods
    verbs:
      - get
      - watch
      - list
  - apiGroups: ["coordination.k8s.io"]
    resources:
      - leases
    verbs:
      - get
      - create
      - update
---
# Source: eck/templates/agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: elastic-agent
  namespace: logging
subjects:
  - kind: ServiceAccount
    name: elastic-agent
    namespace: logging
roleRef:
  kind: ClusterRole
  name: elastic-agent
  apiGroup: rbac.authorization.k8s.io
---
# Source: eck/templates/fleet.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fleet-server
  namespace: logging
  labels:
    k8s-app: fleet-server
subjects:
  - kind: ServiceAccount
    name: fleet-server
    namespace: logging
roleRef:
  kind: ClusterRole
  name: fleet-server
  apiGroup: rbac.authorization.k8s.io
---
# Source: eck/templates/agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elastic-agent
  # should be the namespace where elastic-agent is running
  namespace: logging
  labels:
    k8s-app: elastic-agent
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
# Source: eck/templates/agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elastic-agent-kubeadm-config
  namespace: logging
  labels:
    k8s-app: elastic-agent
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
# Source: eck/templates/agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: logging
  name: elastic-agent
subjects:
  - kind: ServiceAccount
    name: elastic-agent
    namespace: logging
roleRef:
  kind: Role
  name: elastic-agent
  apiGroup: rbac.authorization.k8s.io
---
# Source: eck/templates/agent.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: elastic-agent-kubeadm-config
  namespace: logging
subjects:
  - kind: ServiceAccount
    name: elastic-agent
    namespace: logging
roleRef:
  kind: Role
  name: elastic-agent-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
# Source: eck/templates/services.yaml
# most Elasticsearch configuration parameters are possible to set, e.g: node.attr.attr_name: attr_value
# node.roles: !!seq ""  --> Spinnaker does not like this
# As we don't use ML we can use this as tag to identity the clients for services
---
# Source: eck/templates/agent.yaml
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: elastic-agent
  namespace: logging
  labels:
    helm.sh/chart: eck-1.0.0
    app.kubernetes.io/name: eck
    app.kubernetes.io/instance: RELEASE-NAME
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  version: 7.14.0
  mode: fleet
  kibanaRef:
    name: kibana
  fleetServerRef:
    name: fleet-server
  daemonSet:
    podTemplate:
      spec:
        automountServiceAccountToken: true
        containers:
        - env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: FLEET_INSECURE
            value: "true"
          name: agent
          volumeMounts:
          - mountPath: /mnt/elastic-internal/kibana-association/logging/kibana/certs
            name: kibana-kb-es-ca
            readOnly: true
          - mountPath: /mnt/elastic-internal/fleetserver-association/logging/fleet-server/certs
            name: fleet-server-agent-http-certs-internal
            readOnly: true
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true
        imagePullSecrets:
        - name: regcredgitlabcom
        securityContext:
          runAsUser: 0
        serviceAccountName: elastic-agent
        volumes:
        - name: kibana-kb-es-ca
          secret:
            secretName: kibana-kb-es-ca
        - name: fleet-server-agent-http-certs-internal
          secret:
            secretName: fleet-server-agent-http-certs-internal
---
# Source: eck/templates/fleet.yaml
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: fleet-server
  namespace: logging
  labels:
    helm.sh/chart: eck-1.0.0
    app.kubernetes.io/name: eck
    app.kubernetes.io/instance: RELEASE-NAME
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  version: 7.14.0
  mode: fleet
  fleetServerEnabled: true
  kibanaRef:
    name: kibana
  elasticsearchRefs:
  - name: elasticsearch
  deployment:
    podTemplate:
      spec:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: failure-domain.beta.kubernetes.io/zone
                  operator: In
                  values:
                  - us-west-2c
        automountServiceAccountToken: true
        containers:
        - name: agent
          volumeMounts:
          - mountPath: /mnt/elastic-internal/elasticsearch-association/logging/elasticsearch/certs
            name: elasticsearch-certs
            readOnly: true
          - mountPath: /mnt/elastic-internal/kibana-association/logging/kibana/certs
            name: kibana-kb-es-ca
            readOnly: true
        imagePullSecrets:
        - name: regcredgitlabcom
        nodeSelector:
          team: elk
        securityContext:
          runAsUser: 0
        serviceAccountName: fleet-server
        volumes:
        - name: kibana-kb-es-ca
          secret:
            secretName: kibana-kb-es-ca
        - name: elasticsearch-certs
          secret:
            secretName: elasticsearch-es-cert
    replicas: 1

Manifest for Elasticsearch and Kibana:

---
# Source: eck/templates/storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
allowVolumeExpansion: true
metadata:
  name: elasticsearch-hot
  namespace: logging
parameters:
  encrypted: "true"
  type: gp2
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
---
# Source: eck/templates/storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
allowVolumeExpansion: true
metadata:
  name: elasticsearch-warm
  namespace: logging
parameters:
  encrypted: "true"
  type: st1
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
---
# Source: eck/templates/storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
allowVolumeExpansion: true
metadata:
  name: elasticsearch-cold
  namespace: logging
parameters:
  encrypted: "true"
  type: sc1
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
---
# Source: eck/templates/cert-manager-certs.yaml
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: root-ca-cert
  namespace: logging
spec:
  secretName: root-ca-cert
  commonName: root-ca-cert
  isCA: true
  issuerRef:
    name: root-ca-issuer
    kind: Issuer
  usages:
    - "any"
---
# Source: eck/templates/cert-manager-certs.yaml
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: elasticsearch-es-cert
  namespace: logging
spec:
  isCA: true
  dnsNames:
    - elasticsearch-es-http
    - elasticsearch-es-http:9200
    - elasticsearch-es-http.logging.svc
    - elasticsearch-es-http.logging.svc.cluster.local
  issuerRef:
    kind: Issuer
    name: selfsigned-issuer
  secretName: elasticsearch-es-cert
  usages:
    - "any"
---
# Source: eck/templates/elasticsearch.yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
  namespace: logging
  labels:
    helm.sh/chart: eck-1.0.0
    app.kubernetes.io/name: eck
    app.kubernetes.io/instance: RELEASE-NAME
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  version: 7.14.0
  http:
    tls:
      certificate:
        secretName: elasticsearch-es-cert
  nodeSets:
    - config:
        bootstrap:
          memory_lock: true
        indices.memory.index_buffer_size: 30%
        indices.memory.min_index_buffer_size: 96mb
        logger.level: warn
        node.roles:
        - master
        - data_hot
        - data_content
        - remote_cluster_client
        script.painless.regex.enabled: true
        xpack:
          monitoring:
            collection.enabled: true
          security:
            audit:
              enabled: true
          security.authc.api_key.enabled: true
      count: 3
      name: hot
      podTemplate:
        metadata:
          annotations:
            co.elastic.logs/module: elasticsearch
            iam.amazonaws.com/role: Connect-US-Logstash
          labels:
            app: elasticsearch-data-hot
            k8s_cluster: k8s
        spec:
          containers:
          - env:
            - name: ES_JAVA_OPTS
              value: -Xms6g -Xmx6g
            name: elasticsearch
            resources:
              limits:
                cpu: 2
                memory: 8Gi
              requests:
                cpu: 2
                memory: 8Gi
          imagePullSecrets:
          - name: regcredgitlabcom
          initContainers:
          - command:
            - sh
            - -c
            - sysctl -w vm.max_map_count=262144
            name: sysctl
            securityContext:
              privileged: true
          - command:
            - sh
            - -c
            - bin/elasticsearch-plugin install --batch repository-s3 mapper-size
            name: install-plugins
          nodeSelector:
            team: elk
      volumeClaimTemplates:
      - metadata:
          name: elasticsearch-data
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 30Gi
          storageClassName: elasticsearch-hot
    - config:
        bootstrap:
          memory_lock: true
        indices.queries.cache.size: 40%
        indices.requests.cache.size: 25%
        logger.level: warn
        node.roles:
        - master
        - ingest
        - data_warm
        - data_content
        - remote_cluster_client
        - transform
        script.painless.regex.enabled: true
        xpack:
          monitoring:
            collection.enabled: true
          security:
            audit:
              enabled: true
          security.authc.api_key.enabled: true
      count: 2
      name: warm
      podTemplate:
        metadata:
          annotations:
            co.elastic.logs/module: elasticsearch
            iam.amazonaws.com/role: Connect-US-Logstash
          labels:
            app: elasticsearch-data-warm
            k8s_cluster: k8s
        spec:
          containers:
          - env:
            - name: ES_JAVA_OPTS
              value: -Xms6g -Xmx6g
            name: elasticsearch
            resources:
              limits:
                cpu: 2
                memory: 8Gi
              requests:
                cpu: 2
                memory: 8Gi
          imagePullSecrets:
          - name: regcredgitlabcom
          initContainers:
          - command:
            - sh
            - -c
            - sysctl -w vm.max_map_count=262144
            name: sysctl
            securityContext:
              privileged: true
          - command:
            - sh
            - -c
            - bin/elasticsearch-plugin install --batch repository-s3 mapper-size
            name: install-plugins
          nodeSelector:
            team: elk
      volumeClaimTemplates:
      - metadata:
          name: elasticsearch-data
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 125Gi
          storageClassName: elasticsearch-warm
    - config:
        bootstrap:
          memory_lock: true
        indices.queries.cache.size: 40%
        indices.requests.cache.size: 25%
        logger.level: warn
        node.roles:
        - master
        - ingest
        - data_cold
        - data_content
        - remote_cluster_client
        - transform
        script.painless.regex.enabled: true
        xpack:
          security:
            audit:
              enabled: true
          security.authc.api_key.enabled: true
      count: 3
      name: cold
      podTemplate:
        metadata:
          annotations:
            co.elastic.logs/module: elasticsearch
            iam.amazonaws.com/role: Connect-US-Logstash
          labels:
            app: elasticsearch-data-warm
            k8s_cluster: k8s
        spec:
          containers:
          - env:
            - name: ES_JAVA_OPTS
              value: -Xms6g -Xmx6g
            name: elasticsearch
          imagePullSecrets:
          - name: regcredgitlabcom
          initContainers:
          - command:
            - sh
            - -c
            - sysctl -w vm.max_map_count=262144
            name: sysctl
            securityContext:
              privileged: true
          - command:
            - sh
            - -c
            - bin/elasticsearch-plugin install --batch repository-s3 mapper-size
            name: install-plugins
          nodeSelector:
            team: elk
      volumeClaimTemplates:
      - metadata:
          name: elasticsearch-data
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 125Gi
          storageClassName: elasticsearch-cold
    - config:
        bootstrap:
          memory_lock: true
        logger.level: warn
        node.roles:
        - remote_cluster_client
        - ml
        script.painless.regex.enabled: true
        transport.compress: true
        xpack:
          security:
            audit:
              enabled: true
          security.authc.api_key.enabled: true
      count: 3
      name: client
      podTemplate:
        metadata:
          annotations:
            co.elastic.logs/module: elasticsearch
            iam.amazonaws.com/role: Connect-US-Logstash
          labels:
            app: elasticsearch-client
            k8s_cluster: k8s
        spec:
          containers:
          - env:
            - name: ES_JAVA_OPTS
              value: -Xms3g -Xmx3g
            name: elasticsearch
          imagePullSecrets:
          - name: regcredgitlabcom
          initContainers:
          - command:
            - sh
            - -c
            - sysctl -w vm.max_map_count=262144
            name: sysctl
            securityContext:
              privileged: true
          - command:
            - sh
            - -c
            - bin/elasticsearch-plugin install --batch repository-s3 mapper-size
            name: install-plugins
          nodeSelector:
            team: elk
          volumes:
          - emptyDir: {}
            name: elasticsearch-data
---
# Source: eck/templates/cert-manager-certs.yaml
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: root-ca-issuer
  namespace: logging
spec:
  selfSigned: {}
---
# Source: eck/templates/cert-manager-certs.yaml
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: selfsigned-issuer
  namespace: logging
spec:
  ca:
    secretName: root-ca-cert
---
# Source: eck/templates/kibana.yaml
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
  namespace: logging
  labels:
    helm.sh/chart: eck-1.0.0
    app.kubernetes.io/name: eck
    app.kubernetes.io/instance: RELEASE-NAME
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  version: 7.14.0
  count: 1
  http:
    tls:
      selfSignedCertificate:
        disabled: true
  config:
    elasticsearch.requestHeadersWhitelist:
    - Authorization
    - es-security-runas-user
    elasticsearch.requestTimeout: 100000
    elasticsearch.ssl.verificationMode: none
    enterpriseSearch.ssl.verificationMode: none
    kibana.index: .kibana
    server.maxPayload: 4194304
    server.name: kibana
    xpack.encryptedSavedObjects.encryptionKey: redacted
    xpack.fleet.agents.elasticsearch.host: https://elasticsearch-es-http.logging.svc:9200
    xpack.fleet.agents.fleet_server.hosts:
    - https://fleet-server-agent-http.logging.svc:8220
    xpack.graph.enabled: false
    xpack.license_management.enabled: false
    xpack.ml.enabled: true
    xpack.monitoring.enabled: false
    xpack.monitoring.kibana.collection.enabled: false
    xpack.monitoring.ui.enabled: false
    xpack.reporting.csv.maxSizeBytes: 20971520
    xpack.reporting.enabled: true
    xpack.reporting.encryptionKey: redacted
    xpack.security.enabled: true
    xpack.security.encryptionKey: redacted
    xpack.security.session.idleTimeout: 1h
    xpack.security.session.lifespan: 30d
    xpack.spaces.enabled: true
    xpack.watcher.enabled: false
  elasticsearchRef:
    name: elasticsearch
  podTemplate:
    spec:
      containers:
      - env:
        - name: NODE_OPTIONS
          value: --max-old-space-size=14000
        name: kibana
      imagePullSecrets:
      - name: regcredgitlabcom
      nodeSelector:
        team: elk

Hey @sc75651, thanks for providing the manifests. Using a custom certificate for ES shouldn't be an issue, as ECK will configure the Agent to use it when it talks to ES. I assume you've looked at the Pod logs and didn't find anything suspicious there. As a couple of things might be at play here, I'll suggest a few things:

  1. kubectl exec -it fleet-server-... -- bash into the Fleet Server and Elastic Agent Pods:
    • run ps auxf and check whether the filebeat(s) and metricbeat(s) processes are running
    • look at the log files in /usr/share/elastic-agent/state/data/logs/default; you are most likely to find the culprit there
  2. Drop volumes and volumeMounts from both Agent resources; you shouldn't need them, as ECK will make the right certs available.
  3. Remove the FLEET_INSECURE env var.
  4. Can you successfully deploy any of the Fleet recipes from cloud-on-k8s/config/recipes/elastic-agent at master · elastic/cloud-on-k8s · GitHub? If not, what happens?
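The first suggestion can be sketched as a shell session. The error-filtering helper is my own addition, not part of the advice above; it assumes the Agent writes ndjson logs with a log.level field, as the log snippets later in this thread show:

```shell
# Inside the Fleet Server or Agent Pod, after:
#   kubectl exec -it fleet-server-... -- bash
# check that the beats sub-processes are running:
#   ps auxf | grep -E '[f]ilebeat|[m]etricbeat'

# Helper to surface only error-level lines from the Agent's ndjson logs
# (normally found under /usr/share/elastic-agent/state/data/logs/default):
filter_errors() {
  grep -h '"log.level":"error"' "$@"
}

# Demonstration on a synthetic log file with one info and one error line:
printf '%s\n' \
  '{"log.level":"info","message":"Harvester started"}' \
  '{"log.level":"error","message":"Failed to connect to backoff(elasticsearch(...))"}' \
  > /tmp/agent-sample.ndjson
filter_errors /tmp/agent-sample.ndjson
```

In this thread, filtering for error-level lines would have surfaced the x509 failure quoted in the final post directly.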

Thanks,
David

hi @dkow ,

So, after applying cloud-on-k8s/fleet-kubernetes-integration.yaml at master · elastic/cloud-on-k8s · GitHub in the same namespace as the Elastic Stack (i.e. logging), I got the errors below:

k logs -f elastic-agent-agent-728sw
Error: 1 error: open /mnt/elastic-internal/kibana-association/logging/kibana/certs/ca.crt: no such file or directory reading <nil>

k logs -f fleet-server-agent-79dd758dc7-4xtq4
cp: cannot stat '/mnt/elastic-internal/elasticsearch-association/logging/elasticsearch/certs/ca.crt': No such file or directory

This was the reason I added the volumes and volumeMounts.

Hi @dkow,

Please see this other thread with details on the same issue. It appears that the agent is missing volume mounts for Elasticsearch and Kibana certificates.

I fixed the error about being unable to find the Elasticsearch CA by mounting the correct secret inside the running deployment.

Hey @sc75651 and @keiransteele, thanks for your input and sorry for the delay.

@sc75651, did you apply the entire manifest (including ES/KB) or just the Agent/Fleet parts of it? I cannot reproduce your issue by just altering that exact manifest to deploy into a non-default namespace.

Based on that and your previous manifests I assume you use a different ES/KB, but I want to confirm.

Thanks,
David

Thanks everyone for your input.

This is caused by a bug in the ECK Agent controller. See the other thread for more details.

Thanks,
David

So, after upgrading to 1.8.0, I got the error below in the Metricbeat logs of the Fleet Server:

{"log.level":"error","@timestamp":"2021-09-23T18:51:57.519Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/output.go","file.line":154},"message":"Failed to connect to backoff(elasticsearch(https://elasticsearch-es-coordinator-nodes-headless.logging.svc.cluster.local:9200)): Get \"https://elasticsearch-es-coordinator-nodes-headless.logging.svc.cluster.local:9200\": x509: certificate is valid for elasticsearch-es-http, elasticsearch-es-http:9200, elasticsearch-es-http.logging.svc, elasticsearch-es-http.logging.svc.cluster.local, elasticsearch-es-coordinator-nodes, elasticsearch-es-coordinator-nodes:9200, elasticsearch-es-coordinator-nodes.logging.svc, elasticsearch-es-coordinator-nodes.logging.svc.cluster.local, not elasticsearch-es-coordinator-nodes-headless.logging.svc.cluster.local","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-09-23T18:51:57.519Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/output.go","file.line":145},"message":"Attempting to reconnect to backoff(elasticsearch(https://elasticsearch-es-coordinator-nodes-headless.logging.svc.cluster.local:9200)) with 31 reconnect attempt(s)","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-09-23T18:51:57.520Z","log.logger":"publisher","log.origin":{"file.name":"pipeline/retry.go","file.line":219},"message":"retryer: send unwait signal to consumer","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-09-23T18:51:57.520Z","log.logger":"publisher","log.origin":{"file.name":"pipeline/retry.go","file.line":223},"message":"  done","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2021-09-23T18:51:57.527Z","log.logger":"esclientleg","log.origin":{"file.name":"transport/logging.go","file.line":37},"message":"Error dialing x509: certificate is valid for elasticsearch-es-http, elasticsearch-es-http:9200, elasticsearch-es-http.logging.svc, elasticsearch-es-http.logging.svc.cluster.local, elasticsearch-es-coordinator-nodes, elasticsearch-es-coordinator-nodes:9200, elasticsearch-es-coordinator-nodes.logging.svc, elasticsearch-es-coordinator-nodes.logging.svc.cluster.local, not elasticsearch-es-coordinator-nodes-headless.logging.svc.cluster.local","service.name":"metricbeat","network":"tcp","address":"elasticsearch-es-coordinator-nodes-headless.logging.svc.cluster.local:9200","ecs.version":"1.6.0"}

Updating the custom certificates did the trick; it works now.
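For reference, the fix amounts to extending the cert-manager Certificate's SAN list so it covers every service name clients may dial, including the headless service from the x509 error above. A sketch based on the elasticsearch-es-cert resource earlier in the thread; the coordinator-nodes names are taken from the error message (they evidently correspond to a node set added after the manifests were posted), so adjust them to your own node sets. Note that host:port entries such as elasticsearch-es-http:9200 are not needed, since ports play no part in TLS hostname verification:

```yaml
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: elasticsearch-es-cert
  namespace: logging
spec:
  isCA: true
  secretName: elasticsearch-es-cert
  issuerRef:
    kind: Issuer
    name: selfsigned-issuer
  dnsNames:
    # public HTTP service
    - elasticsearch-es-http
    - elasticsearch-es-http.logging.svc
    - elasticsearch-es-http.logging.svc.cluster.local
    # per-nodeSet services, including the headless variant that
    # Metricbeat resolved in the error above
    - elasticsearch-es-coordinator-nodes
    - elasticsearch-es-coordinator-nodes.logging.svc
    - elasticsearch-es-coordinator-nodes.logging.svc.cluster.local
    - elasticsearch-es-coordinator-nodes-headless.logging.svc.cluster.local
  usages:
    - "any"
```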