I'm trying to set up Logstash (5.0.0-beta1) to process events from a Kafka (0.10.0.0) topic and index them into a fairly busy Elasticsearch cluster (we've been using Logstash and Elasticsearch for a while; Kafka is the new part).
It basically works, but sometimes (presumably when indexing into Elasticsearch takes a while) I get a series of warnings like:
[2016-09-30T14:53:46,102][WARN ][org.apache.kafka.clients.consumer.internals.ConsumerCoordinator] Auto offset commit failed for group test-1: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
[2016-09-30T14:53:46,102][INFO ][org.apache.kafka.clients.consumer.internals.ConsumerCoordinator] Revoking previously assigned partitions [test-0] for group test-1
[2016-09-30T14:53:46,102][INFO ][org.apache.kafka.clients.consumer.internals.AbstractCoordinator] (Re-)joining group test-1
After that, events seem to stop flowing (I suspect the consumer keeps reverting to the last committed offset). I then tried setting the session timeout in the Logstash Kafka input config:
session_timeout_ms => "60000"
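For context, my whole input block looks roughly like this (the topic and group names are the ones from the logs above; the broker address is a placeholder):

```
input {
  kafka {
    bootstrap_servers  => "kafka-broker:9092"   # placeholder address
    topics             => ["test"]
    group_id           => "test-1"
    session_timeout_ms => "60000"
  }
}
```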
and then I get:
[2016-09-28T18:21:41,649][ERROR][logstash.inputs.kafka ] Unable to create Kafka consumer from given configuration {:kafka_error_message=>org.apache.kafka.common.KafkaException: Failed to construct kafka consumer}
This seems like an issue with the kafka input, but I wanted to make sure I didn't miss something. Also, could anyone familiar with the kafka input suggest some settings to tune?