I am planning to build a log storage system. At present, logs are reported over HTTP and written to Kafka (all logs are in JSON format). I plan to read the data from Kafka with Logstash or Filebeat and push it to ES. Because of Logstash's performance overhead, I am leaning toward Filebeat. A few questions came up while planning this, and I would like to discuss them:
- Most of the solutions I found use the pipeline Kafka->Logstash->ES. Is that only because Logstash has more powerful filtering?
- I have many Kafka topics, and the number keeps growing dynamically. How should the Filebeat configuration file be written for this? It does not seem to support wildcards, for example: topics: ["topic*"]
- Suppose I have 1000 topics with 6 partitions each, and every Filebeat instance uses the same group_id and subscribes to all 1000 topics. Is this feasible? In theory, at most 6 Filebeat instances should be consuming any one topic at the same time, right?
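
To make the last question concrete, here is a small Python sketch of how I understand partition assignment inside one consumer group. It is only a simplified model of the range and round-robin strategies that Kafka clients commonly offer, not Filebeat's actual code. The instance names, the choice of 10 Filebeat instances, and the helper functions are all my own assumptions for illustration.

```python
def range_assign(topics, partitions_per_topic, consumers):
    """Simplified range-style assignment: each topic is split on its own
    among the sorted group members (assumption: all subscribe to all topics)."""
    members = sorted(consumers)
    assignment = {m: [] for m in members}
    for topic in topics:
        per_member, extra = divmod(partitions_per_topic, len(members))
        start = 0
        for i, member in enumerate(members):
            # The first `extra` members get one additional partition.
            count = per_member + (1 if i < extra else 0)
            for p in range(start, start + count):
                assignment[member].append((topic, p))
            start += count
    return assignment


def round_robin_assign(topics, partitions_per_topic, consumers):
    """Simplified round-robin-style assignment: all topic-partitions are
    laid out in one sorted list and dealt to the members in turn."""
    members = sorted(consumers)
    assignment = {m: [] for m in members}
    all_partitions = [(t, p) for t in sorted(topics)
                      for p in range(partitions_per_topic)]
    for i, tp in enumerate(all_partitions):
        assignment[members[i % len(members)]].append(tp)
    return assignment


def consumers_per_topic(assignment):
    """Count how many distinct members hold at least one partition of each topic."""
    owners = {}
    for member, tps in assignment.items():
        for topic, _ in tps:
            owners.setdefault(topic, set()).add(member)
    return {topic: len(members) for topic, members in owners.items()}


# Hypothetical setup: 1000 topics, 6 partitions each, 10 Filebeat instances.
topics = [f"topic-{i:04d}" for i in range(1000)]
consumers = [f"filebeat-{i:02d}" for i in range(10)]

range_result = range_assign(topics, 6, consumers)
rr_result = round_robin_assign(topics, 6, consumers)

range_counts = {m: len(tps) for m, tps in range_result.items()}
rr_counts = {m: len(tps) for m, tps in rr_result.items()}

print("range:      ", sorted(range_counts.values()))
print("round robin:", sorted(rr_counts.values()))
print("max instances per topic:",
      max(consumers_per_topic(range_result).values()),
      max(consumers_per_topic(rr_result).values()))
```

In this model no topic is ever read by more than 6 instances, which matches my expectation. What worries me is the range-style case: with 10 instances, the same 6 instances get one partition of every topic (1000 partitions each) and the other 4 stay idle, while the round-robin-style case spreads the 6000 partitions evenly at 600 per instance. I believe the Filebeat Kafka input has a rebalance_strategy setting that selects between these, but I have not verified how it behaves with this many topics.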