Beats 5.6 and Kafka 1.0 SSL handshake error

I hope I am publishing this is in the right spot, since this involves more than just beats.

Currently I am having issues communicating with kafka using beats (filebeat, metricbeat) with SSL. I’m at the point where I do know what my next step should be. The only hint I can fall back on is the fact that the documentation: https://www.elastic.co/guide/en/beats/filebeat/5.6/kafka-output.html#_compatibility_3 states that the .yml output works with “Kafka 0.8, 0.9, and 0.10” but I’m currently using Kafka 1.0 that came out late of last year. I’m really hoping someone can spot the error before I must rollback my kafka version just to test that this is in fact the case. Here are my configurations:

Versions:

  • Filebeat (rpm -qa) === filebeat-5.6.1-1.x86_64
  • Logstash (rpm -qa) === logstash-5.5.2-1.noarch
  • Kafka === kafka_2.11-1.0.0 == aka kafka 1.0

/usr/share/kafka/config/server.properties of my 1st broker:

listeners=SSL://104.170.104.51:50506
advertised.listeners=SSL://104.170.104.51:50506
ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
ssl.keymanager.algorithm=SunX509
ssl.protocol=TLS
ssl.trustmanager.algorithm=PKIX
#
ssl.client.auth=required
ssl.keystore.location=/usr/share/kafka/ssl/kafka.server.keystore.jks
ssl.truststore.location=/usr/share/kafka/ssl/kafka.server.truststore.jks
ssl.keystore.password= keystore_pass
ssl.key.password= keystore_pass
ssl.truststore.password= truststore_pass
security.inter.broker.protocol=SSL
ssl.truststore.type=JKS
ssl.keystore.type=JKS

...etc

Currently able to get logstash to send data to kafka using the following output.conf:

kafka
  {
    acks => "1"
    bootstrap_servers => "104.170.104.51:50506,104.170.104.52:50506,104.170.104.55:50506"
    codec => json
    topic_id => "partitionThreeHundred"
    security_protocol => "SSL"
    ssl_keystore_location => "/etc/logstash/ssl_kafka/logstash/kafka.client.keystore.jks"
    ssl_keystore_password => "keystore_pass"
    ssl_key_password => " keystore_pass "
    ssl_truststore_location => "/etc/logstash/ssl_kafka/logstash/kafka.client.truststore.jks"
    ssl_truststore_password => "truststore_pass"
  }

So the connection is able to be established, and I'm able to grab the output of kafka with another logstash instance and stdout the messages.

Here are the SSL files I generated in order to communicate with kafka on the logstash servers:

11:34:36 # ll /etc/logstash/ssl_kafka/logstash/
total 8
-rwx------ 1 logstash logstash 3133 Jan  4 10:47 kafka.client.keystore.jks
-rwx------ 1 logstash logstash  901 Jan  4 10:47 kafka.client.truststore.jks

However, I’m unable to get filebeat and metricbeat to complete an SSL handshake with the kafka.

Currently getting the following errors /var/log/filebeat/filebeat:
WARN Failed to connect to broker 104.170.104.55:50506: tls: first record does not look like a TLS handshake
client/metadata got error from broker while fetching metadata:%!(EXTRA tls.RecordHeaderError=tls: first record does not look like a TLS handshake)
client/metadata no available broker to send metadata request to
resurrecting 3 dead seed brokers
Kafka connect fails with: kafka: client has run out of available brokers to talk to

This is error is replicated multiple times, replacing 104.170.104.55 with 1 of the other IP addresses.

CONFIGURATION FOR BEAT YML FILES involving output.kafka:

Filebeat.yml:
output.kafka:
  hosts: ["104.170.104.51:50506", "104.170.104.52:50506", "104.170.104.55:50506"]
  topic: 'partitionThreeHundred'
  partition.round_robin:
      reachable_only: false
  required_acks: 1
  max_message_bytes: 1000000
  ssl.enabled: true
  ssl.certificate_authorities: ["/etc/logstash/ssl_kafka/filebeat/domain_cert.pem"]
  ssl.certificate: "/etc/logstash/ssl_kafka/filebeat/cert_signed.pem"
  ssl.key: "/etc/logstash/ssl_kafka/filebeat/beats_private.pem"
Metricbeat.yml:
output.kafka:
  hosts: ["104.170.104.51:50506", "104.170.104.52:50506", "104.170.104.55:50506"]
  topic: 'partitionThreeHundred'
  partition.round_robin:
      reachable_only: false
  required_acks: 1
  max_message_bytes: 1000000
  ssl.enabled: true
  ssl.certificate_authorities: ["/etc/logstash/ssl_kafka/metricbeat/domain_cert.pem"]
  ssl.certificate: "/etc/logstash/ssl_kafka/metricbeat/cert_signed.pem"
  ssl.key: "/etc/logstash/ssl_kafka/metricbeat/beats_private.pem"

At first, I thought it was because my certificate wasn’t signed, or something involving my private key, here are the files and the locations, permissions are 777 for troubleshooting:

/etc/logstash/ssl_kafka

12:01:40 # ll
total 12
drwxrwxrwx 2 filebeat   filebeat   4096 Jan  4 11:53 filebeat
drwx------ 2 logstash   logstash   4096 Jan  4 11:50 logstash
drwxrwxrwx 2 metricbeat metricbeat 4096 Jan  4 11:53 metricbeat
root@server:/etc/logstash/ssl_kafka
12:01:41 # cd filebeat/
12:01:44 # ll
total 16
-rwxrwxrwx 1 filebeat filebeat 1704 Jan  4 11:53 beats_private.pem
-rwxrwxrwx 1 filebeat filebeat  989 Jan  4 11:53 cert_request.csr
-rwxrwxrwx 1 filebeat filebeat 1123 Jan  4 11:53 cert_signed.pem
-rwxrwxrwx 1 filebeat filebeat 1188 Jan  4 11:52 domain_cert.pem

openssl rsa -noout -modulus -in beats_private.pem | openssl md5
(stdin)= 89c7ff80a524c664b72a35bde1c735c8
openssl x509 -noout -modulus -in cert_signed.pem | openssl md5
(stdin)= 89c7ff80a524c664b72a35bde1c735c8
openssl req -noout -modulus -in cert_request.csr | openssl md5
89c7ff80a524c664b72a35bde1c735c8
openssl verify -verbose -CAfile domain_cert.pem cert_signed.pem
cert_signed.pem: OK

As you can see the md5 is the same for all the files involved. And the domain_cert file has signed the cert_signed.pem file.

Metricbeat folder is giving similar results.


If someone can give me some direction on where to go next that would be wonderful.

A connection to a kafka cluster goes like this:

  1. TCP connection
  2. TLS handshake over TCP connection
  3. Fetch cluster metadata from one broker configured in output.kafka.hosts setting
  4. close TCP connection
  5. for each broker:
    5.1: create TCP connection based on advertised host in metadata
    5.2: TLS handshake over TCP connection
    5.3: (optional) SASL authentication

From beats error message, it seems to fail on step 2 or 5.2.

If you configure only one broker in output.kafka.hosts, you have the error message on the configured host only (step 2 fails) or still one error message per broker (step 5.2 fails)?

You can check connection setup via wireshark (use tcpdump to create a dump). Right after the TCP connection is established, the client sends the SERVER HELO message. The beat is complaining about the response to the HELO message not being valid TLS.

Setting up TLS and certificates can be a quite pain. With all bells and whistles enabled, I normally test step by step. First without TLS (to be sure applications are compatible on application layer protocols). Next I add TLS with server authentication only. If this works, continue with client authentication.

Do you use intermediate CAs? Quite old (not tested for a while), but this is how I setup my certificates for testing with trustchain on localhost: https://gist.github.com/urso/cedfdab25b84c8ee3389ff0727d220ad

This topic was automatically closed after 21 days. New replies are no longer allowed.