APM integration on Elastic Agent 7.17 - agents are healthy but no data appears

I have tried to set up elastic-agent with the APM integration. My configuration is nearly the same as the example here: Configuration Examples | Elastic Cloud on Kubernetes [master] | Elastic, except that I use version 7.17 for everything, and I have some custom Kibana configuration to let my client's organization log in using Azure AD.

I have a problem where no APM data appears in Kibana, although the agents appear to be healthy. I would really appreciate it if someone has pointers on what might be wrong!

I initially set up APM Server using a dedicated ApmServer manifest:

apiVersion: apm.k8s.elastic.co/v1
kind: ApmServer
...

This worked well. Then I read that the standalone APM Server is deprecated in version 8, so I tried to switch to the APM integration. I deleted the ApmServer resource and added fleet-server and elastic-agent instead, with configuration to enable the APM integration.

Now my fleet-server and elastic-agent appear in Kibana, look healthy, and show the APM integration as installed. I'm able to send telemetry data to the APM Server endpoint using OpenTelemetry (OTLP), but no data appears in Kibana. I also don't see any of the related data streams, which I would expect to hold the APM data.
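For what it's worth, here is roughly how I checked (the pod name is a placeholder, and I'm assuming the default ECK secret and service names that match the manifests below):

# Grab the elastic user's password the way the ECK docs describe:
ELASTIC_PASSWORD=$(kubectl get secret elasticsearch-es-elastic-user \
  -o go-template='{{.data.elastic | base64decode}}')

# The agent's APM intake port should answer on "/" when apm-server is listening:
kubectl port-forward pod/elastic-agent-agent-xxxxx 8200:8200 &
curl -s http://localhost:8200/

# The APM integration writes to traces-apm*, metrics-apm.* and logs-apm.*
# data streams; for me this comes back empty:
kubectl port-forward service/elasticsearch-es-http 9200:9200 &
curl -sk -u "elastic:$ELASTIC_PASSWORD" \
  "https://localhost:9200/_data_stream/traces-apm*"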

In Kibana, under app/apm/settings/schema, the "Switch to Elastic Agent" option is greyed out, even though I'm logged in as a user with the superuser role.

No logs appear under Fleet in Kibana either.

The container logs for elastic-agent and fleet-server are not showing any errors, but they do contain lines like these:

INFO    operation/operator.go:284       operation 'operation-install' skipped for metricbeat.7.17.3
INFO    operation/operator.go:284       operation 'operation-start' skipped for metricbeat.7.17.3
INFO    operation/operator.go:284       operation 'operation-install' skipped for apm-server.7.17.3
INFO    operation/operator.go:284       operation 'operation-start' skipped for apm-server.7.17.3
INFO    operation/operator.go:284       operation 'operation-install' skipped for filebeat.7.17.3
INFO    operation/operator.go:284       operation 'operation-start' skipped for filebeat.7.17.3
INFO    operation/operator.go:284       operation 'operation-install' skipped for metricbeat.7.17.3
INFO    operation/operator.go:284       operation 'operation-start' skipped for metricbeat.7.17.3

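I also checked the agent's own view of things from inside the container (again, the pod name is a placeholder):

# Ask the running agent for the status of itself and its subprocesses:
kubectl exec -it elastic-agent-agent-xxxxx -- elastic-agent status

# Dump the configuration the agent actually computed from its policy:
kubectl exec -it elastic-agent-agent-xxxxx -- elastic-agent inspect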

Here are my Kubernetes manifests (slightly adjusted to anonymize my client):

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
spec:
  version: 7.17.3
  count: 1
  podTemplate:
    spec:
      containers:
      - name: kibana
        resources:
          requests:
            memory: 1Gi
            cpu: 0.1
          limits:
            memory: 1Gi
            cpu: 0.25
  elasticsearchRef:
    name: elasticsearch
  http:
    tls:
      selfSignedCertificate:
        disabled: true
  config:
    server.publicBaseUrl: "https://logs.myclient.local"
    xpack.security.authc.providers:
      saml.saml1:
        order: 0
        realm: saml1
        description: "Log in with Active Directory"
        hint: "Log in with a MyClient AD account"
      basic.basic1:
        order: 1
        hint: "Log in with a local Elasticsearch account"
        icon: "logoElasticsearch"
    xpack.fleet.agents.elasticsearch.hosts: ["https://elasticsearch-es-http.default.svc:9200"]
    xpack.fleet.agents.fleet_server.hosts: ["https://fleet-server-agent-http.default.svc:8220"]
    xpack.fleet.packages:
    - name: system
      version: latest
    - name: elastic_agent
      version: latest
    - name: fleet_server
      version: latest
    - name: apm
      version: latest
    xpack.fleet.agentPolicies:
    - name: Fleet Server on ECK policy
      id: eck-fleet-server
      is_default_fleet_server: true
      namespace: default
      monitoring_enabled:
      - logs
      - metrics
      package_policies:
      - name: fleet_server-1
        id: fleet_server-1
        package:
          name: fleet_server
    - name: Elastic Agent on ECK policy
      id: eck-agent
      namespace: default
      monitoring_enabled:
      - logs
      - metrics
      unenroll_timeout: 900
      is_default: true
      package_policies:
      - name: system-1
        id: system-1
        package:
          name: system
      - package:
          name: apm
        name: apm-1
        inputs:
        - type: apm
          enabled: true
          vars:
          - name: host
            value: 0.0.0.0:8200
---
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch

spec:
  version: 7.17.3
  auth:
    fileRealm:
    - secretName: credentials-loggingsvc
    roles:
    - secretName: role-kibana-myclient-user
  nodeSets:
  - name: logging
    config:
      node.roles: ["master", "data", "ingest", "ml", "transform"]
      xpack.security.authc.realms.native.native1:
        order: 0
      xpack.security.authc.realms.saml.saml1:
        order: 1
        idp.metadata.path: "https://login.microsoftonline.com/00000000-0000-0000-0000-000000000000/federationmetadata/2007-06/federationmetadata.xml?appid=00000000-0000-0000-0000-000000000000"
        idp.entity_id: "https://sts.windows.net/00000000-0000-0000-0000-000000000000/"
        sp.entity_id:  "https://logs.myclient.local"
        sp.acs: "https://logs.myclient.local/api/security/saml/callback"
        sp.logout: "https://logs.myclient.local/logout"
        attributes.principal: "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name"
        attributes.groups: "http://schemas.microsoft.com/ws/2008/06/identity/claims/role"
        attributes.name: "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname"
        attributes.mail: "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress"

    podTemplate:
      metadata:
        labels:
          app: elasticsearch
      spec:
        # This container sets virtual memory limits in the underlying nodes:
        # https://www.elastic.co/guide/en/cloud-on-k8s/1.1/k8s-virtual-memory.html
        initContainers:
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
          image: alpine
          resources:
            limits:
              cpu: 100m
              memory: 50Mi
            requests:
              cpu: 100m
              memory: 50Mi
        containers:
        - name: elasticsearch
          resources:
            requests:
              memory: 4Gi
              cpu: 1
            limits:
              memory: 4Gi
              cpu: 1
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms2g -Xmx2g"
    count: 3

    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        storageClassName: var.ES_PREMIUM_STORAGE
        resources:
          requests:
            storage: var.ES_POD_STORAGE_SIZEGi

  http:
    service:
      metadata:
        annotations:
          # This creates a LB with an internal IP address instead of public
          # https://docs.microsoft.com/en-us/azure/aks/internal-lb
          service.beta.kubernetes.io/azure-load-balancer-internal: "true"
      spec:
        type: LoadBalancer
        loadBalancerIP: {load_balancer_ip}
    tls:
      selfSignedCertificate:
        subjectAltNames:
        - ip: {load_balancer_ip}
        - dns: logs.myclient.local
---
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: fleet-server
  namespace: default
spec:
  version: 7.17.3
  kibanaRef:
    name: kibana
  elasticsearchRefs:
  - name: elasticsearch
  mode: fleet
  fleetServerEnabled: true
  deployment:
    replicas: 1
    podTemplate:
      spec:
        serviceAccountName: fleet-server
        automountServiceAccountToken: true
        securityContext:
          runAsUser: 0
---
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: elastic-agent
  namespace: default
spec:
  version: 7.17.3
  kibanaRef:
    name: kibana
  fleetServerRef:
    name: fleet-server
  mode: fleet
  deployment:
    replicas: 1
    podTemplate:
      spec:
        securityContext:
          runAsUser: 0
---
# And the RBAC stuff :-)
# I don't have RBAC enabled in my Kubernetes cluster at the moment, so that shouldn't be a problem...
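Since the fleet-server Pod above references serviceAccountName: fleet-server, that ServiceAccount has to exist even with RBAC disabled. For reference, the RBAC section I based mine on looks roughly like this (sketched from the ECK example, so the official recipe is the authoritative version):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fleet-server
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fleet-server
rules:
- apiGroups: [""]
  resources: ["pods", "namespaces", "nodes"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fleet-server
subjects:
- kind: ServiceAccount
  name: fleet-server
  namespace: default
roleRef:
  kind: ClusterRole
  name: fleet-server
  apiGroup: rbac.authorization.k8s.io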

I found that it works if I deploy a brand-new Elasticsearch, Kibana, elastic-agent, and fleet-server, so presumably something had gotten into a bad state. I'll fix it by backing up the essential data, then deleting and recreating everything.
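(For the backup I'm planning an ordinary snapshot, reusing the port-forward from the checks above; a sketch, assuming a snapshot repository named "backup" has already been registered:)

# Snapshot everything before tearing the deployment down;
# "backup" is a placeholder for a pre-registered snapshot repository:
curl -sk -u "elastic:$ELASTIC_PASSWORD" -X PUT \
  "https://localhost:9200/_snapshot/backup/pre-recreate-1?wait_for_completion=true"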
