Kafka publish failed with: circuit breaker is open

Hi all, I'm having a problem with the Kafka output: it sends a couple of MB worth of log messages right after start, then not a single one. It logs very little that points to a connection issue, and both the Kafka cluster and the topics are healthy.

Here are some relevant logs from the top:

2018-03-09T20:18:10.114Z        INFO    kafka/log.go:36 kafka message: [Initializing new client]
2018-03-09T20:18:10.114Z        INFO    kafka/log.go:36 client/metadata fetching metadata for all topics from broker [[kafka-1.kafka:9092]] 
2018-03-09T20:18:10.115Z        INFO    kafka/log.go:36 Connected to broker at [[kafka-1.kafka:9092]] (unregistered)
2018-03-09T20:18:10.117Z        INFO    kafka/log.go:36 client/brokers registered new broker #[[1001 %!d(string=kafka-1.kafka.logging.svc.cluster.local:9092)]] at %!s(MISSING)
2018-03-09T20:18:10.117Z        INFO    kafka/log.go:36 client/brokers registered new broker #[[1003 %!d(string=kafka-2.kafka.logging.svc.cluster.local:9092)]] at %!s(MISSING)
2018-03-09T20:18:10.117Z        INFO    kafka/log.go:36 client/brokers registered new broker #[[1002 %!d(string=kafka-0.kafka.logging.svc.cluster.local:9092)]] at %!s(MISSING)
2018-03-09T20:18:10.117Z        INFO    kafka/log.go:36 kafka message: [Successfully initialized new client]
2018-03-09T20:18:10.119Z        INFO    kafka/log.go:36 producer/broker/[[1001]] starting up

2018-03-09T20:18:10.121Z        INFO    kafka/log.go:36 producer/broker/[[1001 %!d(string=shared) 2]] state change to [open] on %!s(MISSING)/%!d(MISSING)
2018-03-09T20:18:10.121Z        INFO    kafka/log.go:36 producer/broker/[[1001 %!d(string=kube-system) 1]] state change to [open] on %!s(MISSING)/%!d(MISSING)

2018-03-09T20:18:10.121Z        INFO    kafka/log.go:36 producer/broker/[[1001 %!d(string=boldt5) 2]] state change to [open] on %!s(MISSING)/%!d(MISSING)

2018-03-09T20:18:10.119Z        INFO    kafka/log.go:36 producer/broker/[[1002]] starting up

2018-03-09T20:18:10.121Z        INFO    kafka/log.go:36 producer/broker/[[1001 %!d(string=timeseries) 1]] state change to [open] on %!s(MISSING)/%!d(MISSING)

2018-03-09T20:18:10.119Z        INFO    kafka/log.go:36 producer/broker/[[1003]] starting up

2018-03-09T20:18:10.121Z        INFO    kafka/log.go:36 producer/broker/[[1001 %!d(string=logging) 1]] state change to [open] on %!s(MISSING)/%!d(MISSING)

2018-03-09T20:18:10.121Z        INFO    kafka/log.go:36 producer/broker/[[1003 %!d(string=kube-system) 0]] state change to [open] on %!s(MISSING)/%!d(MISSING)

Then, after a first small batch is sent:

2018-03-09T20:21:41.933Z        DEBUG   [kafka] kafka/client.go:234     Kafka publish failed with: circuit breaker is open
2018-03-09T20:21:41.938Z        DEBUG   [kafka] kafka/client.go:220     finished kafka batch

versions:

filebeat container image: docker.elastic.co/beats/filebeat:6.2.2
kafka container image: solsson/kafka:0.11.0.0

config:

filebeat.yml -

    filebeat.config:
      prospectors:
        # Mounted `filebeat-prospectors` configmap:
        path: ${path.config}/prospectors.d/*.yml
        # Reload prospectors configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false

    processors:
      - add_cloud_metadata:

    output.kafka:
      # client_id: '${POD_NAMESPACE}-${NODE_NAME}'
      hosts: ['kafka-0.kafka:9092', 'kafka-1.kafka:9092', 'kafka-2.kafka:9092']
      topic: '%{[kubernetes.namespace]}'

      partition.round_robin:
        reachable_only: true

      required_acks: 0
      compression: gzip
      max_message_bytes: 1000000

      # version: 0.11.0.0

    logging.level: debug
    # logging.selectors: [kafka,output]

kubernetes.yml -

    - type: log
      paths:
        - /var/lib/docker/containers/*/*.log
      json.message_key: log
      json.keys_under_root: true
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
            namespace: ${POD_NAMESPACE}

I guess my question is: how do I even begin to debug this? Is there a way to see actual connection issues apart from setting logging.level?
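One thing that can be checked independently of Filebeat's own logging is whether the broker addresses registered in the logs above (the `*.kafka.logging.svc.cluster.local` names, which differ from the short names in `hosts`) are resolvable and reachable over TCP from where Filebeat runs. Below is a minimal sketch, assuming Python 3 is available in a pod in the same namespace (the Filebeat image itself may not ship it). `check_broker`, `check_all`, and `report` are hypothetical helper names, not part of Filebeat or Kafka, and a successful TCP connect only rules out DNS and network problems, not Kafka protocol or version issues.

```python
import socket

# Broker addresses as they appear in the "registered new broker" log lines.
BROKERS = [
    ("kafka-0.kafka.logging.svc.cluster.local", 9092),
    ("kafka-1.kafka.logging.svc.cluster.local", 9092),
    ("kafka-2.kafka.logging.svc.cluster.local", 9092),
]


def check_broker(host, port, timeout=3.0):
    """Return (True, None) if a TCP connection succeeds, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # OSError covers DNS resolution failures, refused connections, and timeouts.
        return False, f"{type(exc).__name__}: {exc}"


def check_all(brokers, timeout=3.0):
    """Check every (host, port) pair and return a dict keyed by 'host:port'."""
    return {f"{host}:{port}": check_broker(host, port, timeout) for host, port in brokers}


def report(brokers=BROKERS):
    """Print one line per broker describing whether it could be reached."""
    for addr, (ok, reason) in check_all(brokers).items():
        status = "reachable" if ok else f"FAILED ({reason})"
        print(f"{addr}: {status}")
```

Calling `report()` from inside the cluster prints one line per broker. If any advertised address fails here while the short `kafka-N.kafka` names work, the mismatch between bootstrap and advertised hostnames would be a reasonable next thing to look at.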
