Problem with ECS Log Ingestion in Elasticsearch via Elastic Agent on Kubernetes

Hey there,

I'm having trouble ingesting ECS-formatted logs from Kubernetes containers. Here’s a brief overview of my environment and the problem:

Environment:

  • Elasticsearch Version: 8.13.2
  • Kibana Version: 8.13.2
  • Elastic Agent: 8.13.2
  • Kubernetes: Elastic Agent deployed as a DaemonSet to collect logs from various pods

Problem Description:

I have several microservices running in a Kubernetes cluster, and I'm using Elastic Agent to ship their logs to an Elasticsearch cluster. The services write their logs in ECS JSON format. However, when these logs are ingested into Elasticsearch, the ECS JSON is not parsed: it ends up as an escaped string in the message field, surrounded by Kubernetes log metadata. Here's an example of how a log appears in Kibana:

{
  "_index": ".ds-logs-kubernetes.container_logs-debug-2024.07.11-000001",
  "_id": "",
  "_version": 1,
  "_score": 0,
  "_source": {
    "container": {
      "runtime": "containerd"
    },
    "kubernetes": {
      "container": {
        "name": "test-api"
      },
      "node": {
        "hostname": "worker-0",
        "name": "worker-0"
      },
      "pod": {
        "name": "test-api-0"
      },
      "namespace": "elastic-test",
      "annotations": {},
      "replicaset": {
        "name": "test-api-0"
      },
      "namespace_labels": {
        "kubernetes_io/metadata_name": "elastic-test"
      }
    },
    "agent": {
      "name": "worker-0",
      "type": "filebeat",
      "version": "8.13.2"
    },
    "log": {
      "file": {
        "path": "/var/log/containers/test-api-0_elastic-test_default-api-20.log"
      }
    },
    "elastic_agent": {
      "version": "8.13.2",
      "snapshot": false
    },
    "message": "{\"@timestamp\":\"2024-07-11T09:02:46.360593+00:00\",\"log.level\":\"Debug\",\"message\":\"{\\\"PayloadSenderV2\\\"} Enqueued MetricSet. newEventQueueCount: 6 ... ",
    "input": {
      "type": "filestream"
    },
    "@timestamp": "2024-07-11T09:02:46.361Z",
    "ecs": {
      "version": "8.0.0"
    },
    "stream": "stdout",
    "data_stream": {
      "namespace": "debug",
      "type": "logs",
      "dataset": "kubernetes.container_logs"
    },
    "service": {
      "name": "default-api"
    }
  },
  "fields": {
    "elastic_agent.version": [ "8.13.2" ],
    "host.os.name.text": ["Ubuntu" ],
    "host.hostname": ["worker-0" ],
    ...
    "message": ["{\"@timestamp\":\"2024-07-11T09:02:46.360593+00:00\",\"log.level\":\"Debug\",\"message\":\"{\\\"PayloadSenderV2\\\"} Enqueued MetricSet. newEventQueueCount: 6. ... }" ],
    "@timestamp": ["2024-07-11T09:02:46.361Z" ],
    "host.os.platform": ["ubuntu" ],
    ...
    "event.dataset": ["kubernetes.container_logs"]
  }
}

What I expect to be indexed in Elasticsearch is the content of that message string, parsed into fields, roughly:
{\"@timestamp\":\"2024-07-11T09:02:46.360593+00:00\",\"log.level\":\"Debug\",\"message\":\"{\\\"PayloadSenderV2\\\"} Enqueued MetricSet. newEventQueueCount: 6. ... }

It seems like the log message is nested within Kubernetes metadata. Despite trying several configurations, I haven't been able to strip out this metadata and extract the ECS log message properly. I suspect that there might be an issue with the configuration or the ingest pipeline not processing the logs correctly.
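
To make the goal concrete, here is a minimal Python sketch of the transformation I'm after. It is only an illustration of the desired result, not how Elastic Agent or an ingest pipeline works internally. The function name expand_ecs_message is hypothetical, and the sample document is a shortened, hand-written stand-in for the one above (the inner message text is abbreviated).

```python
import json


def expand_ecs_message(source: dict) -> dict:
    """Return a copy of `source` with the JSON held in `message` merged into the top level.

    If `message` is missing, is not valid JSON, or does not decode to an object,
    the document is returned unchanged (as a shallow copy).
    """
    raw = source.get("message")
    if not isinstance(raw, str):
        return dict(source)
    try:
        decoded = json.loads(raw)
    except json.JSONDecodeError:
        return dict(source)
    if not isinstance(decoded, dict):
        return dict(source)

    merged = dict(source)
    # Keys from the application log (including `message` and `@timestamp`)
    # override the values set by the collector.
    merged.update(decoded)
    return merged


# Shortened stand-in for the document shown above (assumed values, trimmed by hand).
doc = {
    "kubernetes": {"pod": {"name": "test-api-0"}},
    "@timestamp": "2024-07-11T09:02:46.361Z",
    "message": json.dumps(
        {
            "@timestamp": "2024-07-11T09:02:46.360593+00:00",
            "log.level": "Debug",
            "message": "Enqueued MetricSet.",
        }
    ),
}

result = expand_ecs_message(doc)
print(json.dumps(result, indent=2))
```

In other words, I would like log.level, the inner message and the application's own @timestamp to become regular fields of the document, while the Kubernetes fields stay alongside them or are dropped.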

Here's my agent policy YAML:

    - name: Elastic_Agent_on_ECK_policy
      id: eck-agent
      namespace: debug
      monitoring_enabled:
      - logs
      - metrics
      unenroll_timeout: 900
      package_policies:
      - package:
          name: system
        name: system-1
      - package:
          name: kubernetes
        name: kubernetes-1
      - package:
          name: apm
        name: apm-1
        inputs:
        - type: apm
          enabled: true
          vars:
          - name: host
            value: 0.0.0.0:8200

Questions:

  • How can I properly configure the ingest pipeline to strip out the Kubernetes metadata and extract the ECS log message correctly?
  • Are there any best practices or common pitfalls I might be overlooking in this setup?
  • Is there a better approach to handle ECS log ingestion from Kubernetes pods using Elastic Agent and Elasticsearch?