I need help monitoring a Kafka cluster with Metricbeat. When Kafka runs on a standalone server, it works fine. When Kafka runs in a Kubernetes environment, how can I configure Metricbeat to collect metrics from it? Is there any autodiscovery available for this?
Metricbeat does have an autodiscover feature, which works with Kubernetes among others. You can configure it either with templates or with hints-based autodiscovery.
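For the hints-based route, a minimal sketch might look like the following. The metricset selection, the 10s period, and port 9092 are assumptions chosen for illustration; adjust them to your deployment.

```yaml
# metricbeat.yml: enable hints on the Kubernetes autodiscover provider
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      hints.enabled: true
---
# Kafka pod spec (fragment): annotations that Metricbeat reads as hints.
# Metricsets, port and period below are assumed values.
metadata:
  annotations:
    co.elastic.metrics/module: kafka
    co.elastic.metrics/metricsets: partition,consumergroup
    co.elastic.metrics/hosts: '${data.host}:9092'
    co.elastic.metrics/period: 10s
```

With hints, the monitoring settings live next to the Kafka workload, so the Metricbeat configuration itself stays generic.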
Error fetching data for metricset kafka.broker: error making http request: Post "http://pipeline-kafka:9092/jolokia/%3FignoreErrors=true&canonicalNaming=false": read tcp 10.5.7.136:34466->172.20.79.178:9092: read: connection reset by peer
Error fetching data for metricset kafka.producer: error making http request: Post "http://pipeline-kafka:9092/jolokia/%3FignoreErrors=true&canonicalNaming=false": read tcp 10.5.7.136:34470->172.20.79.178:9092: read: connection reset by peer
Error fetching data for metricset kafka.partition: error in connect: No advertised broker with address pipeline-kafka:9092 found
telnet to 9092 works fine:
[root@ip-10-5-7-136 metricbeat]# telnet pipeline-kafka 9092
Trying 172.20.79.178...
Connected to pipeline-kafka.
Escape character is '^]'.
Unfortunately connecting with telnet doesn't tell us much about the reason for the TCP connection reset. I'll try to get someone with more Kafka knowledge to take a look at this.
This configuration would instantiate a kafka module for each container whose name contains "kafka". You may need to adjust the condition for your case. ${data.host} will be replaced with the IP of the pod.
These errors may be caused by the Kafka metricsets that use Jolokia. I have left them out of the proposed configuration on purpose for now, as they require additional configuration.
This error usually happens when Metricbeat can connect to the broker, but the broker is not advertising the address Metricbeat is using to connect. If it keeps happening, check which address Kafka is advertising; this is usually logged at Kafka startup.
Thanks a lot @jsoriano. The autodiscovery is working. It's now trying with the IP of the pod.
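A template-based autodiscover block of the kind described could look roughly like this. It is a sketch under assumptions, not necessarily the exact configuration from this post: the condition field, the two metricsets, the period, and port 9092 are illustrative choices.

```yaml
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        # Match every container whose name contains "kafka" (assumed condition).
        - condition:
            contains:
              kubernetes.container.name: "kafka"
          config:
            - module: kafka
              # Only metricsets that speak the Kafka protocol; the
              # Jolokia-based ones need extra setup and are left out.
              metricsets: ["partition", "consumergroup"]
              period: 10s
              # ${data.host} resolves to the IP of the matching pod.
              hosts: ["${data.host}:9092"]
```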
But still not getting the metrics.
Now getting the below error:
Error fetching data for metricset kafka.consumergroup: error in connect: No advertised broker with address 10.5.7.207:9092 found
Error fetching data for metricset kafka.consumergroup: error in connect: Could not get cluster client for advertised broker with address 10.5.7.50:9092
OK, this is the problem with the advertised addresses I mentioned before. Could you check which addresses Kafka is advertising? This is usually logged on startup; I'm not sure whether there is another way of retrieving this information.
[2020-09-04 14:56:15,159] WARN [SocketServer brokerId=0] Unexpected error from /10.5.7.136; closing connection (org.apache.kafka.common.network.Selector)
org.apache.kafka.common.network.InvalidReceiveException: Invalid receive (size = 1347375956 larger than 104857600)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:105)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:447)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:397)
at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:678)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:580)
at org.apache.kafka.common.network.Selector.poll(Selector.java:485)
at kafka.network.Processor.poll(SocketServer.scala:913)
at kafka.network.Processor.run(SocketServer.scala:816)
at java.base/java.lang.Thread.run(Thread.java:834)
Does this mean Metricbeat is trying to send a 1.3 GB request?
Yes, it's the IP of the Metricbeat pod. Since the pod uses the host network, it's the IP of the k8s slave node.
I will get the Kafka advertised address details. Maybe the problem lies there.
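Probably not. One plausible reading, offered as an assumption rather than a confirmed diagnosis: Kafka treats the first 4 bytes of every request as a big-endian length prefix, and the Jolokia-based metricsets were sending an HTTP POST to port 9092 (see the earlier "error making http request" messages). Decoding the reported size as bytes shows what the broker may actually have received:

```python
# The "size" Kafka complained about, copied from the broker log above.
reported_size = 1347375956

# Kafka reads the first 4 bytes of a request as a big-endian int32 length.
# Convert the number back into those 4 bytes to see what was on the wire.
prefix = reported_size.to_bytes(4, byteorder="big")

print(hex(reported_size))  # 0x504f5354
print(prefix)              # b'POST'
```

The four bytes spell "POST", which is consistent with an HTTP request hitting the Kafka listener instead of a Jolokia endpoint. If that is what happened, the warning would go away once the Jolokia-based metricsets are disabled or pointed at a real Jolokia port.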