Autodiscover Kubernetes pod with multiple containers

I have been trying to use the Kubernetes autodiscover provider to retrieve metrics from Prometheus endpoints on my pods. Each pod also contains a cloudsql-proxy container, which should not be scraped. The problem is that when I set the scraping port to one that is exposed on the application container, 3 configs are created (and subsequently merged, according to the docs, as they are equal): one for each container and one for the pod itself. This doesn't seem to be a problem until I actually look at the data that ends up in Elasticsearch, where the container-specific fields have values taken from the cloudsql-proxy container spec. This is not what I expected: since the port in use belongs to the application container, that container should also be the one supplying the values for fields like kubernetes.container.name etc.

The only way I can get the correct values for these fields is with hints-based autodiscovery, by pinning the annotation to the specific container I want: co.elastic.metrics.containerA/hosts. The problem is that there are still 3 start events in the log, and I'm seeing errors in the Metricbeat log because the configurations for the cloudsql-proxy container and for the pod itself don't contain any host information: won't start runner: 1 error: host parsing failed for prometheus-collector: error parsing URL: empty host.

So, what I can't seem to figure out is how to get the correct kubernetes.container.name etc. while still having only 1 configuration in Metricbeat and no errors in the log. Can anyone shed some light on this?

Btw, I'm using Metricbeat 7.9.

My current configuration:

Metricbeat autodiscover:

metricbeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true

Pod annotations:

annotations:
  co.elastic.metrics/module: prometheus
  co.elastic.metrics.nobb-datafetcher/hosts: '${data.host}:8080'
  co.elastic.metrics/metrics_path: "/actuator/prometheus"
  co.elastic.metrics/period: 10s

Pod spec containers:

containers:
  - name: nobb-datafetcher
    image: "eu.gcr.io/dev-mg/nobb-datafetcher:master-980eebc339bd63f9cff79a3591deee8781ccd224"
    ports:
      - name: http
        containerPort: 8080
        protocol: TCP
  - name: cloudsql-proxy
    image: gcr.io/cloudsql-docker/gce-proxy:1.17
    command: ["/cloud_sql_proxy",
              "--term_timeout=30s",
              "-instances=<instance_name>=tcp:5432",
              "-credential_file=/secrets/cloudsql/credentials.json"]

Logs:

2020-09-17T10:29:55.918Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:174	Got a start event: map[config:[0xc0008cf5f0] host:10.24.3.8 id:52cdaa16-588c-48ba-8f29-cc11ab807d59 kubernetes:{"annotations":{"checksum/config":"795b7083dc2644e13d481c325f22402ac25a973279908eaeda1b81c89cd01f73","cni":{"projectcalico":{"org/podIP":"10.24.3.8/32"}},"co":{"elastic":{"metrics":{"nobb-datafetcher/hosts":"${data.host}:8080"},"metrics/metrics_path":"/actuator/prometheus","metrics/module":"prometheus","metrics/period":"10s"}},"kubectl":{"kubernetes":{"io/restartedAt":"2020-08-06T16:24:13+02:00"}}},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"976d76c88"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-lugi"},"pod":{"name":"nobb-datafetcher-976d76c88-jpr8v","uid":"52cdaa16-588c-48ba-8f29-cc11ab807d59"},"replicaset":{"name":"nobb-datafetcher-976d76c88"}} meta:{"kubernetes":{"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"976d76c88"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-lugi"},"pod":{"name":"nobb-datafetcher-976d76c88-jpr8v","uid":"52cdaa16-588c-48ba-8f29-cc11ab807d59"},"replicaset":{"name":"nobb-datafetcher-976d76c88"}}} ports:{"http":8080} provider:34ca019c-b00c-417a-921d-7fd4296d1934 start:true]
2020-09-17T10:29:55.918Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:195	Generated config: {
  "enabled": true,
  "metrics_path": "/actuator/prometheus",
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "10s",
  "timeout": "3s"
}
2020-09-17T10:29:55.918Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:259	Got a meta field in the event
2020-09-17T10:29:55.918Z	ERROR	[autodiscover]	autodiscover/autodiscover.go:209	Auto discover config check failed for config '{
  "enabled": true,
  "metrics_path": "/actuator/prometheus",
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "10s",
  "timeout": "3s"
}', won't start runner: 1 error: host parsing failed for prometheus-collector: error parsing URL: empty host
2020-09-17T10:29:55.918Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:174	Got a start event: map[config:[0xc0009bb7a0] host:10.24.3.8 id:52cdaa16-588c-48ba-8f29-cc11ab807d59.nobb-datafetcher kubernetes:{"annotations":{"checksum/config":"795b7083dc2644e13d481c325f22402ac25a973279908eaeda1b81c89cd01f73","cni":{"projectcalico":{"org/podIP":"10.24.3.8/32"}},"co":{"elastic":{"metrics":{"nobb-datafetcher/hosts":"${data.host}:8080"},"metrics/metrics_path":"/actuator/prometheus","metrics/module":"prometheus","metrics/period":"10s"}},"kubectl":{"kubernetes":{"io/restartedAt":"2020-08-06T16:24:13+02:00"}}},"container":{"id":"67e7258db82e97bf9db86c31dde6c1d51528f35eaf1b25024f990ecc0a6197df","image":"eu.gcr.io/dev-mg/nobb-datafetcher:master-0ee1ce232931c86da11cbc1d9eee680f13537c72","name":"nobb-datafetcher","runtime":"docker"},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"976d76c88"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-lugi"},"pod":{"name":"nobb-datafetcher-976d76c88-jpr8v","uid":"52cdaa16-588c-48ba-8f29-cc11ab807d59"},"replicaset":{"name":"nobb-datafetcher-976d76c88"}} meta:{"kubernetes":{"container":{"image":"eu.gcr.io/dev-mg/nobb-datafetcher:master-0ee1ce232931c86da11cbc1d9eee680f13537c72","name":"nobb-datafetcher"},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"976d76c88"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-lugi"},"pod":{"name":"nobb-datafetcher-976d76c88-jpr8v","uid":"52cdaa16-588c-48ba-8f29-cc11ab807d59"},"replicaset":{"name":"nobb-datafetcher-976d76c88"}}} port:8080 provider:34ca019c-b00c-417a-921d-7fd4296d1934 start:true]
2020-09-17T10:29:55.918Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:195	Generated config: {
  "enabled": true,
  "hosts": [
    "xxxxx"
  ],
  "metrics_path": "/actuator/prometheus",
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "10s",
  "timeout": "3s"
}
2020-09-17T10:29:55.918Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:259	Got a meta field in the event
2020-09-17T10:29:55.919Z	WARN	[tls]	tlscommon/tls_config.go:83	SSL/TLS verifications disabled.
2020-09-17T10:29:55.919Z	DEBUG	[autodiscover]	cfgfile/list.go:63	Starting reload procedure, current runners: 0
2020-09-17T10:29:55.919Z	DEBUG	[autodiscover]	cfgfile/list.go:81	Start list: 1, Stop list: 0
2020-09-17T10:29:55.923Z	DEBUG	[autodiscover]	cfgfile/list.go:100	Starting runner: RunnerGroup{prometheus [metricsets=1]}
2020-09-17T10:29:55.928Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:174	Got a start event: map[config:[0xc000ff5fb0] host:10.24.3.8 id:52cdaa16-588c-48ba-8f29-cc11ab807d59.cloudsql-proxy kubernetes:{"annotations":{"checksum/config":"795b7083dc2644e13d481c325f22402ac25a973279908eaeda1b81c89cd01f73","cni":{"projectcalico":{"org/podIP":"10.24.3.8/32"}},"co":{"elastic":{"metrics":{"nobb-datafetcher/hosts":"${data.host}:8080"},"metrics/metrics_path":"/actuator/prometheus","metrics/module":"prometheus","metrics/period":"10s"}},"kubectl":{"kubernetes":{"io/restartedAt":"2020-08-06T16:24:13+02:00"}}},"container":{"id":"3db2b8c733bf9e7f99b7109f7e5704576e11d4561f544c5c643e9862c7646d4a","image":"gcr.io/cloudsql-docker/gce-proxy:1.17","name":"cloudsql-proxy","runtime":"docker"},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"976d76c88"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-lugi"},"pod":{"name":"nobb-datafetcher-976d76c88-jpr8v","uid":"52cdaa16-588c-48ba-8f29-cc11ab807d59"},"replicaset":{"name":"nobb-datafetcher-976d76c88"}} meta:{"kubernetes":{"container":{"image":"gcr.io/cloudsql-docker/gce-proxy:1.17","name":"cloudsql-proxy"},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"976d76c88"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-lugi"},"pod":{"name":"nobb-datafetcher-976d76c88-jpr8v","uid":"52cdaa16-588c-48ba-8f29-cc11ab807d59"},"replicaset":{"name":"nobb-datafetcher-976d76c88"}}} port:0 provider:34ca019c-b00c-417a-921d-7fd4296d1934 start:true]
2020-09-17T10:29:55.928Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:195	Generated config: {
  "enabled": true,
  "metrics_path": "/actuator/prometheus",
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "10s",
  "timeout": "3s"
}
2020-09-17T10:29:55.928Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:259	Got a meta field in the event
2020-09-17T10:29:55.929Z	WARN	[tls]	tlscommon/tls_config.go:83	SSL/TLS verifications disabled.
2020-09-17T10:29:55.931Z	ERROR	[autodiscover]	autodiscover/autodiscover.go:209	Auto discover config check failed for config '{
  "enabled": true,
  "metrics_path": "/actuator/prometheus",
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "10s",
  "timeout": "3s"
}', won't start runner: 1 error: host parsing failed for prometheus-collector: error parsing URL: empty host

If you look at https://www.elastic.co/guide/en/beats/metricbeat/7.9/configuration-autodiscover.html, there is a section about monitoring all containers of a pod. I think you need to define a condition that excludes cloudsql-proxy by container name.
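To make the suggestion concrete, a template with a condition on the container name might look something like the sketch below. Matching the wanted container with equals is equivalent to excluding cloudsql-proxy here, since the pod only has two containers. The module settings and port are taken from the earlier post; treat this as an untested sketch, not a verified configuration:

```yaml
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      templates:
        # Only emit a config for the application container,
        # so cloudsql-proxy never produces a runner.
        - condition:
            equals:
              kubernetes.container.name: nobb-datafetcher
          config:
            - module: prometheus
              metricsets: ["collector"]
              metrics_path: /actuator/prometheus
              period: 10s
              hosts: ["${data.host}:8080"]
```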

I've looked at the docs, but they aren't really clear on how to achieve this. They mention some filtering rules, but I can't find any documentation on using such rules. What I can find is the section you're referring to, which talks about exposing ports and specifying ports in the config. I am already specifying port 8080 (which is exposed on only 1 container), and when I try to exclude the cloudsql-proxy container with the config below, I still see events in the Metricbeat log for this container.

templates:
  - condition:
      contains:
        kubernetes.annotations.prometheus.io/scrape: "true"
      not contains:
        kubernetes.container.name: "cloudsql-proxy"

When this "not contains"-restriction is added to the configuration, this is what I see in the logs:

2020-09-21T10:40:07.496Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:174	Got a start event: map[config:[0xc000de2b70] host:10.24.0.11 id:c9594c54-4632-41bd-8c0c-97e97ea35ff3 kubernetes:{"annotations":{"checksum/config":"795b7083dc2644e13d481c325f22402ac25a973279908eaeda1b81c89cd01f73","cni":{"projectcalico":{"org/podIP":"10.24.0.11/32"}},"kubectl":{"kubernetes":{"io/restartedAt":"2020-08-06T16:24:13+02:00"}},"prometheus":{"io/path":"/actuator/prometheus","io/port":"8080","io/scrape":"true"}},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"58f485794b"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-avvi"},"pod":{"name":"nobb-datafetcher-58f485794b-xrf6k","uid":"c9594c54-4632-41bd-8c0c-97e97ea35ff3"},"replicaset":{"name":"nobb-datafetcher-58f485794b"}} meta:{"kubernetes":{"annotations":{"prometheus_io/path":"/actuator/prometheus","prometheus_io/port":"8080","prometheus_io/scrape":"true"},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"58f485794b"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-avvi"},"pod":{"name":"nobb-datafetcher-58f485794b-xrf6k","uid":"c9594c54-4632-41bd-8c0c-97e97ea35ff3"},"replicaset":{"name":"nobb-datafetcher-58f485794b"}}} ports:{"http":8080} provider:1a520e72-e5ea-4dd7-be98-5f91b459b240 start:true]
2020-09-21T10:40:07.496Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:195	Generated config: {
  "hosts": [
    "xxxxx"
  ],
  "metrics_path": "/actuator/prometheus",
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "10s"
}
2020-09-21T10:40:07.497Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:259	Got a meta field in the event
2020-09-21T10:40:07.497Z	DEBUG	[autodiscover]	cfgfile/list.go:63	Starting reload procedure, current runners: 0
2020-09-21T10:40:07.497Z	DEBUG	[autodiscover]	cfgfile/list.go:81	Start list: 1, Stop list: 0
2020-09-21T10:40:07.498Z	DEBUG	[autodiscover]	cfgfile/list.go:100	Starting runner: RunnerGroup{prometheus [metricsets=1]}
2020-09-21T10:40:07.498Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:174	Got a start event: map[config:[0xc000e2b500] host:10.24.0.11 id:c9594c54-4632-41bd-8c0c-97e97ea35ff3.nobb-datafetcher kubernetes:{"annotations":{"checksum/config":"795b7083dc2644e13d481c325f22402ac25a973279908eaeda1b81c89cd01f73","cni":{"projectcalico":{"org/podIP":"10.24.0.11/32"}},"kubectl":{"kubernetes":{"io/restartedAt":"2020-08-06T16:24:13+02:00"}},"prometheus":{"io/path":"/actuator/prometheus","io/port":"8080","io/scrape":"true"}},"container":{"id":"68df574bf7e7ea0b2d5b99d67595a134141616d5055ce1f198a674ce5ba6d647","image":"eu.gcr.io/dev-mg/nobb-datafetcher:master-862ca361112b4bd3ea9524532501524889f16cc0","name":"nobb-datafetcher","runtime":"docker"},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"58f485794b"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-avvi"},"pod":{"name":"nobb-datafetcher-58f485794b-xrf6k","uid":"c9594c54-4632-41bd-8c0c-97e97ea35ff3"},"replicaset":{"name":"nobb-datafetcher-58f485794b"}} meta:{"kubernetes":{"annotations":{"prometheus_io/path":"/actuator/prometheus","prometheus_io/port":"8080","prometheus_io/scrape":"true"},"container":{"image":"eu.gcr.io/dev-mg/nobb-datafetcher:master-862ca361112b4bd3ea9524532501524889f16cc0","name":"nobb-datafetcher"},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"58f485794b"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-avvi"},"pod":{"name":"nobb-datafetcher-58f485794b-xrf6k","uid":"c9594c54-4632-41bd-8c0c-97e97ea35ff3"},"replicaset":{"name":"nobb-datafetcher-58f485794b"}}} port:8080 provider:1a520e72-e5ea-4dd7-be98-5f91b459b240 start:true]
2020-09-21T10:40:07.498Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:195	Generated config: {
  "hosts": [
    "xxxxx"
  ],
  "metrics_path": "/actuator/prometheus",
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "10s"
}
2020-09-21T10:40:07.498Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:259	Got a meta field in the event
2020-09-21T10:40:07.499Z	DEBUG	[autodiscover]	cfgfile/list.go:63	Starting reload procedure, current runners: 1
2020-09-21T10:40:07.499Z	DEBUG	[autodiscover]	cfgfile/list.go:81	Start list: 0, Stop list: 0
2020-09-21T10:40:07.499Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:174	Got a start event: map[config:[0xc000bb4a80] host:10.24.0.11 id:c9594c54-4632-41bd-8c0c-97e97ea35ff3.cloudsql-proxy kubernetes:{"annotations":{"checksum/config":"795b7083dc2644e13d481c325f22402ac25a973279908eaeda1b81c89cd01f73","cni":{"projectcalico":{"org/podIP":"10.24.0.11/32"}},"kubectl":{"kubernetes":{"io/restartedAt":"2020-08-06T16:24:13+02:00"}},"prometheus":{"io/path":"/actuator/prometheus","io/port":"8080","io/scrape":"true"}},"container":{"id":"5edd4e42bfebd6d783fc1e1365fccb601d2cb0b5b1a8fc6106260064938e4ba9","image":"gcr.io/cloudsql-docker/gce-proxy:1.17","name":"cloudsql-proxy","runtime":"docker"},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"58f485794b"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-avvi"},"pod":{"name":"nobb-datafetcher-58f485794b-xrf6k","uid":"c9594c54-4632-41bd-8c0c-97e97ea35ff3"},"replicaset":{"name":"nobb-datafetcher-58f485794b"}} meta:{"kubernetes":{"annotations":{"prometheus_io/path":"/actuator/prometheus","prometheus_io/port":"8080","prometheus_io/scrape":"true"},"container":{"image":"gcr.io/cloudsql-docker/gce-proxy:1.17","name":"cloudsql-proxy"},"labels":{"app_kubernetes_io/instance":"nobb-datafetcher","app_kubernetes_io/name":"nobb-datafetcher","pod-template-hash":"58f485794b"},"namespace":"nobb-datafetcher","node":{"name":"gke-datahub-gke-cluster-node-pool-1-6db93494-avvi"},"pod":{"name":"nobb-datafetcher-58f485794b-xrf6k","uid":"c9594c54-4632-41bd-8c0c-97e97ea35ff3"},"replicaset":{"name":"nobb-datafetcher-58f485794b"}}} port:0 provider:1a520e72-e5ea-4dd7-be98-5f91b459b240 start:true]
2020-09-21T10:40:07.500Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:195	Generated config: {
  "hosts": [
    "xxxxx"
  ],
  "metrics_path": "/actuator/prometheus",
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "10s"
}

There are still 3 start events, and the container fields are still taken from the cloudsql-proxy container.
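One thing worth checking in the attempted config above: as far as I know, Beats conditions don't accept a "not contains" key directly; negation is expressed by nesting a condition under not, and multiple conditions are combined with and. Rewriting the snippet along those lines (untested sketch, field names taken from your config):

```yaml
templates:
  - condition:
      and:
        - contains:
            kubernetes.annotations.prometheus.io/scrape: "true"
        - not:
            contains:
              kubernetes.container.name: "cloudsql-proxy"
```

With an unsupported key, the condition may be silently ignored, which would explain why the cloudsql-proxy events still appear.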