Connecting Kafka and Metricbeat Docker containers to Elastic Cloud

I am seeing contradictory logs from the Kafka module in Metricbeat, and the module produces no output.

I have a Docker Compose file with a ZooKeeper + Kafka cluster, where the main broker is kafka-sscc-1:9092 and its service has the port mapping ports: 9092:9092
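
As a sketch only, the port mapping described above would sit in the broker's service definition roughly like this. Only the service name and the ports entry come from the description; everything else about the broker service is omitted because it is not shown here.

```yaml
kafka-sscc-1:
  # image, environment and other settings are not shown in the question
  ports:
    # host port 9092 -> container port 9092
    - "9092:9092"
```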

In the same Docker Compose file I also have the following:

metricbeat:
  image: docker.elastic.co/beats/metricbeat:7.5.2
  container_name: metricbeat
  user: root
  environment:
    - strict.perms=false
  #network_mode: host
  volumes:
    - './metricbeat.docker.yml:/usr/share/metricbeat/metricbeat.yml:ro'
    - '/var/run/docker.sock:/var/run/docker.sock:ro'
    - '/sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro'
    - '/proc:/hostfs/proc:ro'
    - '/:/hostfs:ro'

And this is my metricbeat.docker.yml file:

metricbeat.config:
  modules:
    path: ${path.config}/modules.d/*.yml
    # Reload module configs as they change:
    reload.enabled: false

metricbeat.autodiscover:
  providers:
    - type: docker
      hints.enabled: true

metricbeat.modules:
- module: kafka
  metricsets: ["consumergroup", "partition"]
  period: 10s
  hosts: ["kafka-sscc-1:9092"]
  enabled: true

output.elasticsearch:
  hosts: <private vpn host>
  username: 'elastic'
  password: <redacted>
  ssl.verification_mode: none

I was having trouble connecting to the private VPN host, but restarting the Docker daemon fixed that.

However, I am now getting the following logs from my Metricbeat container:

metricbeat      | 2020-02-12T11:13:18.902Z	INFO	kafka/log.go:53	Failed to connect to broker localhost:9092: dial tcp 127.0.0.1:9092: connect: connection refused
metricbeat      | 2020-02-12T11:13:18.902Z	INFO	kafka/log.go:53	Failed to connect to broker localhost:9092: dial tcp 127.0.0.1:9092: connect: connection refused
metricbeat      | 2020-02-12T11:13:18.903Z	INFO	kafka/log.go:53	Failed to connect to broker localhost:9092: dial tcp 127.0.0.1:9092: connect: connection refused
metricbeat      | 2020-02-12T11:13:18.903Z	INFO	module/wrapper.go:252	Error fetching data for metricset kafka.partition: error in connect: failed to query metadata: dial tcp 127.0.0.1:9092: connect: connection refused
metricbeat      | 2020-02-12T11:13:18.903Z	INFO	kafka/log.go:53	Failed to connect to broker localhost:9092: dial tcp 127.0.0.1:9092: connect: connection refused
metricbeat      | 2020-02-12T11:13:18.904Z	INFO	module/wrapper.go:252	Error fetching data for metricset kafka.consumergroup: error in connect: failed to query metadata: dial tcp 127.0.0.1:9092: connect: connection refused
metricbeat      | 2020-02-12T11:13:26.032Z	INFO	kafka/log.go:53	Connected to broker at kafka-sscc-1:9092 (unregistered)
metricbeat      | 2020-02-12T11:13:26.034Z	INFO	kafka/log.go:53	Closed connection to broker kafka-sscc-1:9092
metricbeat      | 2020-02-12T11:13:26.583Z	INFO	kafka/log.go:53	Connected to broker at kafka-sscc-1:9092 (unregistered)
metricbeat      | 2020-02-12T11:13:26.626Z	INFO	kafka/log.go:53	Closed connection to broker kafka-sscc-1:9092

My questions are:

  • Why is Metricbeat trying to connect to localhost? Nowhere in these files do I set the host to localhost; I only ever reference kafka-sscc-1.

  • What do these errors mean, and what does "dial tcp" mean?

    metricbeat | 2020-02-12T11:17:48.905Z INFO module/wrapper.go:252 Error fetching data for metricset kafka.partition: error in connect: failed to query metadata: dial tcp 127.0.0.1:9092: connect: connection refused

    metricbeat | 2020-02-12T11:18:18.907Z INFO module/wrapper.go:252 Error fetching data for metricset kafka.consumergroup: error in connect: failed to query metadata: dial tcp 127.0.0.1:9092: connect: connection refused

  • Why do those errors point at 127.0.0.1:9092?

  • Ultimately, how do I fix them?

Thanks for your help, let me know if you need to know anything else.

Solved by adding the following to the metricbeat service in the Docker Compose file:

depends_on:
  - kafka-sscc-1
  - ...
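
For reference, here is a sketch of how the metricbeat service from the question might look with that change applied. It only combines the snippets already shown; the remaining depends_on entries stay elided, as in the fix above. One caveat: in this short form, depends_on only controls the order in which Compose starts containers. It does not wait for Kafka to be ready to accept connections, so a few connection errors right after startup may still show up in the logs.

```yaml
metricbeat:
  image: docker.elastic.co/beats/metricbeat:7.5.2
  container_name: metricbeat
  user: root
  environment:
    - strict.perms=false
  depends_on:
    # Start the Kafka broker container before Metricbeat
    - kafka-sscc-1
    # ... (further services, elided in the original fix)
  volumes:
    - './metricbeat.docker.yml:/usr/share/metricbeat/metricbeat.yml:ro'
    - '/var/run/docker.sock:/var/run/docker.sock:ro'
    - '/sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro'
    - '/proc:/hostfs/proc:ro'
    - '/:/hostfs:ro'
```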