Kafka Bug?

We are trying, so far unsuccessfully, to connect Filebeat to Kafka using SSL. We have dropped Kafka back to the "supported" version, and it still fails. In the course of troubleshooting, we found the following:
1: Our Kafka server is configured to use port 9093 for SSL. We disabled port 9092 altogether. Our Filebeat config file is set to connect on 9093.
2: We connect to Kafka on 9093, and we get a list of the brokers. Yay, me! Sadly, when Filebeat then tries to connect to the brokers, it fails with the following error:

2017/06/23 19:24:36.289851 log.go:12: WARN Failed to connect to broker hostnamedeleted:9093: tls: first record does not look like a TLS handshake

3: I shake my fists at the sky cursing SSL! I check to see what happened to the connections with a netstat -tulpan, and discover something evil:
tcp 1 0 172.16.123.104:37274 ipdeleted:9092 CLOSE_WAIT 14956/filebeat
tcp 1 0 172.16.123.104:55792 ipdeleted.59:9092 CLOSE_WAIT 14956/filebeat
tcp 1 0 172.16.123.104:50669 ipdeleted:9092 CLOSE_WAIT 14956/filebeat

Even though the config file lists only port 9093, and the Kafka server uses only port 9093, Filebeat is still talking on 9092.

Why does this happen?

Connecting a producer to a Kafka cluster happens in two steps.

The first step is the bootstrapping process. It uses the Kafka brokers you initially configure in the Kafka output in Filebeat (or in any Kafka producer). On bootstrap, one of the configured brokers is chosen at random and queried for the cluster metadata. The metadata contains information about all known brokers, topics, and partitions. The broker information includes each broker's advertised address (hostname and port).

In step two, the Kafka client spins up a worker for each broker, each trying to connect to one of the advertised addresses.
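
To make the two steps concrete, here is a minimal sketch in plain Python (no Kafka client involved) that compares the addresses you configured for bootstrapping with the addresses returned in the metadata. The function names and the `kafka*.example` hostnames are made up for illustration; in practice you would fill in the advertised list from whatever the cluster actually reports.

```python
from typing import List, Tuple


def parse_address(text: str) -> Tuple[str, int]:
    """Split 'host:port' into (host, port)."""
    host, _, port = text.rpartition(":")
    return host, int(port)


def unexpected_ports(bootstrap: List[str], advertised: List[str]) -> List[str]:
    """Return advertised addresses whose port is not used by any bootstrap host."""
    expected = {parse_address(addr)[1] for addr in bootstrap}
    return [addr for addr in advertised if parse_address(addr)[1] not in expected]


# Step 1: hosts listed in the producer config (hypothetical values).
bootstrap = ["kafka1.example:9093"]
# Step 2: addresses the cluster metadata hands back (hypothetical values).
advertised = ["kafka1.example:9092", "kafka2.example:9092"]

for addr in unexpected_ports(bootstrap, advertised):
    print(f"broker advertises {addr}, which does not match the configured port")
```

If this reports anything, the workers in step two will connect to a port you never configured, which would match the 9092 connections in the netstat output above.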

When a Kafka broker joins a Kafka cluster, it broadcasts its public "advertised address" to all other brokers in the cluster, so they can update the cluster metadata. A common error is to have a broker advertise a wrong hostname (e.g. localhost) and/or port. Check your Kafka server config files for:
advertised.host.name, advertised.listeners, advertised.port, host.name, listeners, port
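
As a rough aid for that check, the sketch below reads the text of a broker's server.properties and reports which address the broker would probably advertise. The precedence used here (advertised.listeners, then listeners, then the older advertised.host.name/advertised.port, then host.name/port) is my understanding of the Kafka broker documentation and is an assumption you should verify for your Kafka version. The helper names and the sample values are hypothetical, and real brokers apply extra rules (default hostname lookup, for example) that this ignores.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def effective_advertised(props: Dict[str, str]) -> List[str]:
    """Guess the advertised listeners using the assumed precedence."""
    for key in ("advertised.listeners", "listeners"):
        if props.get(key):
            return [item.strip() for item in props[key].split(",")]
    # Legacy settings; 9092 is the assumed default port.
    host = props.get("advertised.host.name") or props.get("host.name") or "<default hostname>"
    port = props.get("advertised.port") or props.get("port") or "9092"
    return [f"PLAINTEXT://{host}:{port}"]


sample = """
# hypothetical broker config
listeners=SSL://0.0.0.0:9093
advertised.port=9092
"""
print(effective_advertised(parse_properties(sample)))
```

With a real file you would pass its contents to `parse_properties` instead of the sample string, then compare the result with the port Filebeat is configured to use.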

It turns out the problem may have been that the format of the certificate we were given for Kafka was not supported. Once a new certificate was generated, we were able to connect. The original certificate worked with OpenSSL, so we assumed it was OK.
