[Fleet Agent] Cannot get Kubernetes autodiscovery to work

Hello,

Previously I was running Metricbeat, Filebeat, and Heartbeat separately, each with its own manifest. I want to migrate to a single Elastic Agent instance that does it all. Here is my config:

apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: agent-magic
  namespace: potato
spec:
  version: 7.14.1
  kibanaRef:
    name: potato
  fleetServerRef:
    name: fleet-server
  mode: fleet
  daemonSet:
    podTemplate:
      spec:
        serviceAccountName: elastic-agent
        hostNetwork: true
        dnsPolicy: ClusterFirstWithHostNet
        automountServiceAccountToken: true
        securityContext:
          runAsUser: 0
        containers:
          - name: agent
            volumeMounts:
              - mountPath: /var/lib/docker/containers
                name: varlibdockercontainers
              - mountPath: /var/log/containers
                name: varlogcontainers
              - mountPath: /var/log/pods
                name: varlogpods
        volumes:
          - name: varlibdockercontainers
            hostPath:
              path: /var/lib/docker/containers
          - name: varlogcontainers
            hostPath:
              path: /var/log/containers
          - name: varlogpods
            hostPath:
              path: /var/log/pods
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: elastic-agent
rules:
  - apiGroups: [""]
    resources:
      - pods
      - nodes
      - namespaces
      - events
      - services
      - configmaps
    verbs:
      - get
      - watch
      - list
  - apiGroups: ["coordination.k8s.io"]
    resources:
      - leases
    verbs:
      - get
      - create
      - update
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get
  - apiGroups: ["extensions"]
    resources:
      - replicasets
    verbs:
      - "get"
      - "list"
      - "watch"
  - apiGroups:
      - "apps"
    resources:
      - statefulsets
      - deployments
      - replicasets
    verbs:
      - "get"
      - "list"
      - "watch"
  - apiGroups:
      - ""
    resources:
      - nodes/stats
    verbs:
      - get
  - apiGroups:
      - "batch"
    resources:
      - jobs
    verbs:
      - "get"
      - "list"
      - "watch"
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elastic-agent
  namespace: potato
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: elastic-agent
subjects:
  - kind: ServiceAccount
    name: elastic-agent
    namespace: potato
roleRef:
  kind: ClusterRole
  name: elastic-agent
  apiGroup: rbac.authorization.k8s.io

In the UI I added a log integration and the Kubernetes integration. I always get the following error in the agent logs:

2021-09-14T17:52:54.431Z	INFO	[composable.providers.kubernetes]	kubernetes/util.go:114	kubernetes: Using pod name ip-10-xx-xx-xx.ec2.internal and namespace potato to discover kubernetes node
2021-09-14T17:52:54.438Z	ERROR	[composable.providers.kubernetes]	kubernetes/util.go:117	kubernetes: Querying for pod failed with error: pods "ip-10-xx-xx-xx.ec2.internal" not found

Metrics are coming in, but without the Kubernetes metadata. Logs are not coming in at all: the log path /var/log/containers/*${kubernetes.container.id}.log cannot be resolved, I assume because of the broken autodiscovery.
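
For context on what that pattern should expand to: the kubelet symlinks every container log into /var/log/containers as <pod>_<namespace>_<container>-<container-id>.log, so with a working provider the integration would end up tailing paths like the one below (the pod, container, and ID are made up for illustration):

/var/log/containers/my-app-7c9d8b6f4-x2kqp_potato_my-app-4f3c2e1d9a8b7c6d5e4f3a2b1c0d9e8f7a6b5c4d3e2f1a0b9c8d7e6f5a4b3c2d.log

Since the provider never resolves the pod, ${kubernetes.container.id} gets no value and nothing matches.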

If I remove hostNetwork: true, the agent is able to get the pod information successfully, but then a bunch of new hosts show up under the pod name rather than the node name.
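
For reference, the stock Elastic Agent DaemonSet manifest passes the node name into the container through the downward API. I don't know whether the 7.14 Kubernetes provider actually falls back to it, but in the ECK podTemplate that would look roughly like this:

daemonSet:
  podTemplate:
    spec:
      containers:
        - name: agent
          env:
            # Sketch: expose the node name to the agent via the downward API,
            # as the reference elastic-agent manifests do. Whether the 7.14
            # provider reads NODE_NAME is an assumption on my part.
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName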

So I can't seem to find a way to make this work through configuration, given that in Fleet mode there is no agent config file I can edit.

Anybody encountered this problem before?

Phil

As per the GitHub issue, this problem went away when running Elastic Agent 7.15.0.
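
If it helps anyone landing here later, with the ECK Agent resource posted above that should just be a version bump (keeping in mind that Fleet-managed agents generally can't be newer than the Kibana/Fleet Server they enroll with):

apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: agent-magic
  namespace: potato
spec:
  version: 7.15.0   # bumped from 7.14.1; the rest of the spec stays as posted above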

Thanks,
David