I have multiple firewall log sources. All of these sources push their logs to a dedicated Kafka topic named "firewall".
In Logstash, I would like to have a different pipeline for each of these sources, to apply different processing and use a different index. For example, one pipeline for Fortigate and one pipeline for Juniper.
Let's check with the following Fortigate pipeline (I didn't change the Kafka group_id, which defaults to "logstash"):
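The original pipeline config is not reproduced in this thread, but based on the discussion (a kafka input on the "firewall" topic that assigns a "Fortigate" tag, and an output conditional on that tag), it would look roughly like this sketch. The broker address, Elasticsearch host, and index name are placeholders, not the poster's actual values:

```
# Hypothetical reconstruction of the Fortigate pipeline discussed in this thread.
input {
  kafka {
    bootstrap_servers => "kafka:9092"   # placeholder broker address
    topics => ["firewall"]
    # group_id is not set, so it defaults to "logstash"
    tags => ["Fortigate"]               # tag assigned to every consumed event
  }
}

filter {
  # Fortigate-specific parsing would go here, e.g. kv { } for key=value syslog
}

output {
  if "Fortigate" in [tags] {
    elasticsearch {
      hosts => ["http://localhost:9200"]    # placeholder
      index => "fortigate-%{+YYYY.MM.dd}"   # hypothetical index name
    }
  }
}
```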
If you do not define an output section then I do not think the pipeline will be executed. However, the output section does not have to actually send anything anywhere; it is OK if the conditional is never true.
I made a test (no output, no filter, just my kafka input), and when looking at Kafka metrics on a Kafka machine (kafka-consumer-groups.sh ...), I can see the logs are consumed, as the lag is not increasing. If they were not consumed, this would not be the case (in my understanding of how Kafka works).
So even if I don't have any output, the pipeline is executed.
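For reference, the lag check described above can be done with Kafka's bundled consumer-groups tool. This assumes a broker reachable at localhost:9092 and the default "logstash" consumer group from this thread:

```
# Describe the "logstash" consumer group; the LAG column shows how far
# the group is behind the latest offset on each partition.
kafka-consumer-groups.sh \
  --bootstrap-server localhost:9092 \
  --describe \
  --group logstash
```

If LAG stays near zero while producers keep writing, the group is actively consuming.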
I think that you misunderstand how Kafka works. If you want two different pipelines to both be able to consume all of the events in a topic, each pipeline must be configured for a separate consumer group using the group_id option.
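A minimal sketch of what that looks like, assuming hypothetical group names of your choosing. Because the two inputs are in different consumer groups, each pipeline independently receives every event on the topic:

```
# Fortigate pipeline input
input {
  kafka {
    topics   => ["firewall"]
    group_id => "logstash-fortigate"   # distinct group: gets all events
  }
}
```

```
# Juniper pipeline input
input {
  kafka {
    topics   => ["firewall"]
    group_id => "logstash-juniper"     # distinct group: also gets all events
  }
}
```

If both pipelines shared the same group_id, Kafka would instead split the partitions between them, and each event would be seen by only one pipeline.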
You haven't really filtered on anything. At the moment it doesn't look like you are thinking about this problem the right way. I believe what you are really trying to build is this: kafka_logstash_siem.pdf
Not really. In your kafka input you assign the tag "Fortigate" to all of the messages consumed from the "firewall" topic. Then in the output you check to see if "Fortigate" is a tag. Of course it is always a tag because you assigned it to every event in the input. So the end result is that you haven't filtered anything.
Yes... You've got a point! I didn't update my post, but after some tests I removed the tag from this input and added it on the Logstash instance which acts as a collector:
Fortigate -> Logstash collector (where I add the tag on the Fortigate syslog input) -> Kafka -> Logstash (which does the processing and uses, in its kafka input, the tag added by the previous Logstash)
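That collector stage might be sketched as follows. The listening port is a hypothetical choice, and the `json` codec is assumed so that fields and tags survive the round trip through Kafka:

```
# Collector Logstash sketch: tag at ingest, then produce to Kafka.
input {
  syslog {
    port => 5140              # hypothetical port for Fortigate syslog
    tags => ["Fortigate"]     # tag added here instead of on the kafka input
  }
}

output {
  kafka {
    bootstrap_servers => "kafka:9092"   # placeholder
    topic_id => "firewall"
    codec => json                       # preserve fields and tags across Kafka
  }
}
```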
If the collector is adding the tag, why use a tag at all? Why not just produce the record to a Kafka topic called "fortigate"? The other pipeline consumes from the "fortigate" topic. You then no longer need a filter in the output, because this pipeline gets ONLY fortigate events. You also avoid consuming other logs that aren't fortigate and having to discard them.
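Under that per-source-topic approach, the two ends would look something like this sketch (broker, host, and index names are placeholders):

```
# Collector: produce Fortigate events straight to a "fortigate" topic.
output {
  kafka {
    bootstrap_servers => "kafka:9092"
    topic_id => "fortigate"
    codec => json
  }
}
```

```
# Processing pipeline: consume only the "fortigate" topic.
# No conditional is needed in the output, since every event here is Fortigate.
input {
  kafka {
    topics   => ["fortigate"]
    group_id => "logstash-fortigate"
    codec => json
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "fortigate-%{+YYYY.MM.dd}"
  }
}
```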
Yes. I could do that. I thought it might be better to minimize the number of topics, to facilitate maintenance. Maybe that is not a good idea after all.