I need to build a scalable Kafka consumer cluster that reads data from dynamically added Kafka topics and bulk-indexes it into Elasticsearch. At present, I have developed a Java consumer client (packaged as a jar) that reads data from a given list of Kafka topics and feeds it to the ES cluster using bulk indexing. I also tried Spark, but its performance was 60% lower than running the jar from the command line.
New topics will be added to the Kafka cluster on the fly, so I need a scalable cluster that consumes data from these various Kafka topics and feeds it to ES.
Please suggest a better way to build a scalable Kafka consumer and bulk-index the data.
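
To make the bulk-indexing step concrete, here is a rough sketch of how a batch of records might be turned into a request body for the Elasticsearch `_bulk` endpoint. It is an illustration only, not the actual client code: the class `BulkBody` and its `build` method are hypothetical names, it assumes each record value is already a single-line JSON document, and it assumes the index name needs no escaping and is chosen by the caller (for example, the Kafka topic name).

```java
import java.util.List;

// Hypothetical helper for illustration; not part of any Kafka or Elasticsearch client library.
final class BulkBody {
    private BulkBody() {}

    // Builds a newline-delimited JSON body for the Elasticsearch _bulk endpoint:
    // one "index" action line followed by one source line per document.
    // Assumption: every entry in jsonDocs is valid JSON on a single line.
    static String build(String index, List<String> jsonDocs) {
        StringBuilder body = new StringBuilder();
        for (String doc : jsonDocs) {
            body.append("{\"index\":{\"_index\":\"").append(index).append("\"}}\n");
            body.append(doc).append('\n'); // the body must end with a newline
        }
        return body.toString();
    }
}
```

A consumer loop would collect the values from one poll (or until a size or time threshold is hit), call `BulkBody.build(topic, values)`, and send the result in a single bulk request before committing offsets.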