Logstash, Kafka, multiple topics, rebalancing one topic


I have a 3-node ELK 5.4.1 cluster (12 CPUs, 32 GB RAM, Windows Server 2012 R2, VMs on VMware). The nodes pull data from Kafka. The sources are Windows event logs, DNS logs, and syslog from network devices. Windows logs are around 500 per second, DNS logs around 5000 per second, and syslog around 6000 per second, with higher rates during peak hours.

The sources are divided into 3 topics in Kafka. Each topic has 3 partitions with 1 replica.

In Logstash I have tried 2 approaches.

  1. One Logstash Kafka input with multiple topics in an array.
  2. A separate Logstash Kafka input per topic.
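For reference, the two approaches look roughly like the sketch below. The broker address, topic names, and group IDs are placeholders and not my real values; the option names (`bootstrap_servers`, `topics`, `group_id`, `consumer_threads`) are from the kafka input plugin documentation as I understand it, so check them against the plugin version you run.

```
# --- Approach 1: one kafka input subscribed to all three topics ---
input {
  kafka {
    bootstrap_servers => "kafka1:9092"                 # placeholder broker address
    topics            => ["windows", "dns", "syslog"]  # placeholder topic names
    group_id          => "logstash"                    # placeholder consumer group
  }
}

# --- Approach 2 (instead of approach 1, not in addition to it) ---
# One kafka input per topic, each in its own consumer group.
input {
  kafka {
    bootstrap_servers => "kafka1:9092"    # placeholder
    topics            => ["windows"]      # placeholder
    group_id          => "logstash_windows"
  }
  kafka {
    bootstrap_servers => "kafka1:9092"    # placeholder
    topics            => ["dns"]          # placeholder
    group_id          => "logstash_dns"
    consumer_threads  => 1                # illustrative; 3 nodes x 1 thread covers 3 partitions
  }
  kafka {
    bootstrap_servers => "kafka1:9092"    # placeholder
    topics            => ["syslog"]       # placeholder
    group_id          => "logstash_syslog"
  }
}
```

One difference worth noting: consumer groups rebalance independently, so with a distinct `group_id` per topic a rebalance of the DNS consumers should not disturb the consumers of the other two topics. With a shared `group_id`, all members of the group take part in every rebalance.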

However, for some reason my DNS logs are consistently falling behind, while the other logs are fine. In the Kafka log I can see that Kafka is rebalancing my DNS topic. While the problem is occurring, this happens every minute.

I see that rebalancing with Kafka seems to be a quite common issue. Some recommend increasing the session timeout, which I tried, but it didn't change anything.
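To make the discussion concrete, the session-timeout change that is usually suggested looks roughly like this in the kafka input. This is only a sketch: the values are illustrative, the broker, topic, and group names are placeholders, and the option names (`session_timeout_ms`, `heartbeat_interval_ms`, `max_poll_records`) should be verified against the documentation for the installed plugin version.

```
input {
  kafka {
    bootstrap_servers     => "kafka1:9092"   # placeholder broker address
    topics                => ["dns"]          # placeholder topic name
    group_id              => "logstash_dns"   # placeholder consumer group
    session_timeout_ms    => "30000"          # illustrative: time without a heartbeat before the consumer is considered dead
    heartbeat_interval_ms => "10000"          # illustrative: commonly kept at no more than a third of the session timeout
    max_poll_records      => "500"            # illustrative: smaller batches mean less work between polls
  }
}
```

The reason `max_poll_records` is included: depending on the Kafka client version bundled with the plugin, heartbeats may only be sent as part of polling, so a consumer that takes too long to process a batch can miss its session timeout and trigger a rebalance even when it is otherwise healthy. The session timeout also has to fall within the range the brokers allow (`group.min.session.timeout.ms` to `group.max.session.timeout.ms`).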

So my questions:

  1. Should multiple topics be in separate Logstash Kafka inputs? Does it matter?
  2. What are the recommended performance tuning settings for Logstash and Kafka?
  3. Should I reduce the number of topics to 1? Does it matter?
  4. Is this a known issue?
  5. How can I monitor the Kafka input plugin?
  6. I was thinking of skipping Kafka, now that we have persistent queues in Logstash. Is that a reasonable option?

Or should I just add more nodes?

