Filebeat -> Kafka How to best define list of hosts (IPs vs DNS vs Zookeeper)

With Kafka, producers don't talk to Zookeeper; they query metadata directly from the Kafka brokers. This simplifies producer logic a lot, since the metadata API contains all the information a client needs, without it having to rummage through Zookeeper every now and then. See the Kafka docs about the producer.

Partitions in Kafka are replicated, and the leader of a partition gets re-elected now and then. If the leader is no longer available (or returns an error saying it is no longer the leader), the producer has to find the new leader by querying a broker (picked at random) for the current metadata, which should be in sync with the metadata in Zookeeper. See the Kafka docs on replication and leader election, and the protocol guide for the consequences for clients. Given that Kafka still uses Zookeeper and metadata should be in sync between all brokers (either via consensus or by relaying metadata to Zookeeper), I don't see any major problems here.
This is pretty similar to how the Kafka Java API from the Kafka project itself works. If there are any downsides (besides the bootstrap brokers being down) to querying metadata via a broker instead of Zookeeper, the Kafka devs should know best.
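
To make the metadata/leader dance a bit more concrete, here is a toy simulation in Python. It does not use a real Kafka client: `FakeCluster`, `ToyProducer` and `NotLeaderError` are made-up names for illustration, and the real protocol has more moving parts (for one, a real client can also ask brokers it learned about from metadata, not only the bootstrap list). It only sketches the idea: fetch metadata from any reachable broker, send to the leader, and refresh metadata when the leader is gone or has changed.

```python
import random


class NotLeaderError(Exception):
    """Raised by the simulated broker when it does not lead the partition."""


class FakeCluster:
    """Hypothetical stand-in for a Kafka cluster (not a real client API)."""

    def __init__(self, brokers, leaders):
        self.brokers = set(brokers)   # brokers that are currently up
        self.leaders = dict(leaders)  # partition -> current leader broker

    def metadata(self, broker):
        # Any live broker can answer a metadata request.
        if broker not in self.brokers:
            raise ConnectionError(f"{broker} is down")
        return dict(self.leaders)

    def produce(self, broker, partition, message):
        if broker not in self.brokers:
            raise ConnectionError(f"{broker} is down")
        if self.leaders[partition] != broker:
            raise NotLeaderError(partition)
        return f"{message} -> partition {partition} on {broker}"


class ToyProducer:
    def __init__(self, cluster, bootstrap):
        self.cluster = cluster
        self.bootstrap = list(bootstrap)
        self.leaders = {}
        self.refresh_metadata()

    def refresh_metadata(self):
        # Try the bootstrap brokers in random order until one answers.
        for broker in random.sample(self.bootstrap, len(self.bootstrap)):
            try:
                self.leaders = self.cluster.metadata(broker)
                return
            except ConnectionError:
                continue
        raise ConnectionError("no bootstrap broker reachable")

    def send(self, partition, message, retries=3):
        for _ in range(retries):
            try:
                leader = self.leaders[partition]
                return self.cluster.produce(leader, partition, message)
            except (NotLeaderError, ConnectionError):
                # Cached leader is stale or down: ask for fresh metadata.
                self.refresh_metadata()
        raise RuntimeError("could not find a leader")


cluster = FakeCluster(["b1", "b2", "b3"], {0: "b1"})
producer = ToyProducer(cluster, bootstrap=["b1", "b2"])
print(producer.send(0, "hello"))

# Simulate b1 dying and b3 being elected leader for partition 0.
cluster.brokers.discard("b1")
cluster.leaders[0] = "b3"
print(producer.send(0, "world"))
```

Note that the second send still works even though one of the two bootstrap brokers is down, because the other one can hand out the current metadata. With only `b1` in the bootstrap list, a freshly started producer would have no way in.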

The metadata (queried via a broker) contains the addresses as advertised by the individual brokers. That is, one still has to be careful not to advertise the wrong host name (e.g. localhost).
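
As a rough illustration of the kind of sanity check one could run on the advertised addresses, here is a small Python sketch. The function names and the `(host, port)` list format are my own assumptions, not anything Kafka or Beats provides; it simply flags brokers that advertise a loopback address, which remote clients would not be able to reach.

```python
import ipaddress


def is_loopback_host(host):
    """Return True if host is 'localhost' or a loopback IP literal."""
    if host.lower() in ("localhost", "localhost.localdomain"):
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so treat it as a regular host name.
        return False


def suspicious_advertised(brokers):
    """brokers: list of (host, port) pairs as they would appear in metadata."""
    return [(host, port) for host, port in brokers if is_loopback_host(host)]


# Hypothetical advertised addresses, for illustration only.
advertised = [("kafka1.example.com", 9092), ("localhost", 9092), ("127.0.0.1", 9093)]
print(suspicious_advertised(advertised))
```

A name that merely resolves to a loopback address would not be caught here, since the sketch does no DNS lookups.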

For consumers, offsets can be managed either via Zookeeper or via Kafka: http://kafka.apache.org/documentation.html#impl_offsettracking . Only when using consumer groups does coordination have to go through Zookeeper (see docs). Consumers are even discouraged from tracking offsets via Zookeeper, as this is considered deprecated. But since Beats is a producer only, we don't need to care about this.

To me it seems like Kafka is going to discourage clients from using Zookeeper in favour of talking to Kafka directly (maybe more so in upcoming releases).

Still, one wants to configure multiple Kafka brokers for bootstrapping the connection (the partition-finding process). If all bootstrap brokers are down, clients won't be able to connect to the Kafka cluster.
