Hello there,
I will receive syslog messages from firewall devices and send them to Kafka as a producer, because there may be several consumers reading the events, each possibly at a different offset.
Now my question: it probably depends on the server-side specifications and the network as well, but is there a hard cap on Logstash's capacity for processing events?
The rate is around 70-120k TPS depending on the hour. I'm pretty sure one Logstash won't be able to handle that, even without any filters processing the incoming data.
I will just stream the data from syslog to Kafka.
Just curious how to set up my Logstash cluster.
Will I need Beats before Kafka?
Is only one Logstash OK? Or should it output to 3-4 other Logstash instances with loadbalance set to true?
As the best option, I guess I have to point different firewall devices at different Logstash IPs/ports for better input performance at the start.
Whether one Logstash is enough will depend on the Logstash server specs and the pipeline configuration (what filters and outputs you are going to use); the only way to know is to test.
How are you planning to get the messages from your firewalls into Kafka? Normally you can configure the device to send the logs to a syslog server, but not directly to Kafka, which requires a client.
I have some use cases where one Logstash server works as a listener to receive the firewall messages and send them into Kafka; for this I use the udp input and the kafka output.
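A minimal sketch of that listener pipeline could look like the following; the port, broker addresses, and topic name are assumptions you would replace with your own values.

```
input {
  udp {
    port => 5514        # assumed port the firewalls send syslog to
    workers => 4        # more input workers can help at high message rates
  }
}

output {
  kafka {
    bootstrap_servers => "kafka01:9092,kafka02:9092,kafka03:9092"  # assumed broker list
    topic_id => "firewall-logs"                                    # assumed topic name
    codec => json                                                  # forward the raw event as JSON
  }
}
```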
After that, I have other Logstash servers that consume from the same Kafka topic using the same group id.
Since you are planning to use Kafka for this, you can scale your Logstash servers pretty easily if you need to. Just use the same group id for all consumers of the same topic; this way you will not get duplicates. Also try to set the number of partitions for your topic equal to the number of Logstash servers.
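For example, a consumer pipeline on each of those Logstash servers could look roughly like this; the broker list, topic, group id, and the Elasticsearch output are assumptions, use whatever destination you actually have.

```
input {
  kafka {
    bootstrap_servers => "kafka01:9092,kafka02:9092,kafka03:9092"  # assumed broker list
    topics => ["firewall-logs"]       # same topic the listener writes to
    group_id => "logstash-firewall"   # same group id on every consumer node, avoids duplicates
    consumer_threads => 1             # threads * nodes should not exceed the partition count
    codec => json
  }
}

output {
  elasticsearch {
    hosts => ["https://es01:9200"]    # assumed destination, replace with your own output
  }
}
```

With, say, 4 partitions and 4 consumer nodes sharing that group id, Kafka assigns one partition to each node, so each event is processed by exactly one member of the group.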