Kafka Consumer cluster that can read dynamic topics and bulk index the data

Ramky · September 15, 2015, 6:18am

I need to build a scalable Kafka consumer cluster that reads data from dynamically added Kafka topics and bulk indexes it into Elasticsearch. At present, I have developed a Java consumer client jar that reads data from a given list of Kafka topics and feeds it to the ES cluster using bulk indexing. I also tried Spark, but its performance was 60% lower than running the jar from the command line.

New topics will be added to the Kafka cluster on the fly, so I need a scalable cluster that consumes the various Kafka topics and feeds the data to ES.

Please suggest a better way to build a scalable Kafka consumer and bulk index the data.

warkolm · September 15, 2015, 6:25am

Instead of building that jar you could have just used Logstash - https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html

Not sure what you can do around the dynamic aspect.

tinle · September 15, 2015, 5:23pm

Kafka 0.8.2 or newer supports whitelist and blacklist regex for topics. That is most likely what you are looking for.

I use KCC (kafka-console-consumer) on the command line and pipe its output onward to ingest the Kafka data. KCC supports --whitelist and --blacklist flags.
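
To illustrate how regex-based selection picks up topics that are added later, here is a minimal sketch in plain Java. It does not call Kafka at all: `TopicFilter`, its methods, and the topic names are hypothetical and exist only for illustration, and applying a whitelist and a blacklist together in one filter is a choice made for this sketch, not a statement about how KCC combines its flags. In a real consumer the list of topics would come from the cluster rather than a hard-coded list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: decides which topic names a consumer should read,
// based on a whitelist regex and an optional blacklist regex.
final class TopicFilter {
    private final Pattern whitelist;
    private final Pattern blacklist; // null means "no blacklist"

    TopicFilter(String whitelistRegex, String blacklistRegex) {
        this.whitelist = Pattern.compile(whitelistRegex);
        this.blacklist = blacklistRegex == null ? null : Pattern.compile(blacklistRegex);
    }

    // A topic is accepted when its whole name matches the whitelist
    // and does not match the blacklist.
    boolean accepts(String topic) {
        if (!whitelist.matcher(topic).matches()) {
            return false;
        }
        return blacklist == null || !blacklist.matcher(topic).matches();
    }

    // Re-run this whenever the topic list is refreshed; newly created
    // topics that match the whitelist are selected without a code change.
    List<String> select(List<String> topics) {
        return topics.stream().filter(this::accepts).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        TopicFilter filter = new TopicFilter("logs\\..*", "logs\\.debug");
        // Made-up topic names standing in for what the cluster would report.
        List<String> current = Arrays.asList("logs.web", "logs.debug", "metrics.cpu", "logs.db");
        System.out.println(filter.select(current));
    }
}
```

The point of the sketch is that the consumer is configured with a pattern instead of a fixed topic list, so a topic created after startup is picked up the next time the list is evaluated.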

Ramky · September 16, 2015, 10:00am

Tinle,
Thank you for your response. I will try your suggestion.

Ramky · September 16, 2015, 10:01am

Mark,
Thanks for your suggestion.
