I need to filter events coming from the same source at the Filebeat level and tag them (filtred / not_filtred, for example) before sending them to Logstash.
And I wonder if there are options to duplicate events (like the clone filter in Logstash) without duplicating the configuration, like this:
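For reference, the duplicated-input Filebeat setup being described would look roughly like this (a hypothetical sketch; the input type, ids, and paths are placeholders, and the include/exclude patterns follow the behaviour described later in the thread):

```
filebeat.inputs:
  # First copy: only REQUEST/RESPONSE lines, tagged "filtred"
  - type: filestream
    id: app-logs-filtred
    paths:
      - /var/log/app/*.log        # placeholder path
    include_lines: ['REQUEST', 'RESPONSE']
    tags: ["filtred"]

  # Second copy of the SAME files: everything except DEBUG, tagged "not_filtred"
  - type: filestream
    id: app-logs-not-filtred
    paths:
      - /var/log/app/*.log        # placeholder path
    exclude_lines: ['DEBUG']
    tags: ["not_filtred"]
```

Every option except the tag and the include/exclude patterns has to be repeated, which is exactly the duplication in question.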
I am really curious why you need to do this before logstash and cannot do this in logstash...
This would only make sense if you need to split these events and send them to two different Logstash endpoints...
Can you elaborate on this?
The source is a k8s cluster in the cloud and the ELK stack is on premise, so I need to filter events and duplicate sources for two reasons:
the cost of transferring logs from the cloud to the on-premise site, so I don't need to transfer all logs down.
I need two types of events: one with RESPONSE and REQUEST lines, for which some processing is configured in Logstash; the other without DEBUG events, which is not processed but stored as-is in Elasticsearch.
If I understand correctly, your current example configuration sends events containing 'RESPONSE' and 'REQUEST' with the tag 'filtered', and all lines excluding 'DEBUG' with the tag 'not_filtered'.
This means that all lines except 'DEBUG' are being sent anyway.
Why not use only the second part of your config and add the 'filtered' or 'not_filtered' tag afterwards in Logstash? This way you even limit the amount of data going from your k8s cluster to your on-prem environment, because you only send events once... right?
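A minimal sketch of doing that tagging in Logstash instead (the tag names mirror the ones used elsewhere in this thread; the substring checks on [message] are an assumption about the log format):

```
filter {
  if [message] =~ /REQUEST/ or [message] =~ /RESPONSE/ {
    mutate { add_tag => ["filtred"] }
  } else {
    mutate { add_tag => ["not_filtred"] }
  }
}
```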
But in Logstash we need to send events to two different indices based on tags, so on the Logstash side the configuration looks like this:
input {
  # beats input configuration goes here
}
filter {
  if "filtred" in [tags] {
    ## do some parsing, processing and so on
  }
}
output {
  if "filtred" in [tags] {
    elasticsearch {
      hosts => ["es-host"]
      index => "filtred-index"
    }
  }
  if "not_filtred" in [tags] {
    elasticsearch {
      hosts => ["es-host"]
      index => "not-filtred-index"
    }
  }
}
@Mhag It's difficult without an actual example or sanitized data, but you could use a processor in Filebeat with a regex condition to add the "filtered" tag to lines containing "RESPONSE" or "REQUEST".
You could also do this in Logstash.
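In Filebeat, that processor could be sketched with the add_tags processor under a regexp condition (the field name and patterns are assumptions about the log format):

```
processors:
  - add_tags:
      tags: ["filtred"]
      when:
        or:
          - regexp:
              message: 'REQUEST'
          - regexp:
              message: 'RESPONSE'
```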
Then you would use a negative filter to send data to your unfiltered index with:
output {
  if "filtred" in [tags] {
    elasticsearch {
      hosts => ["es-host"]
      index => "filtred-index"
    }
  }
  if "filtred" not in [tags] {
    elasticsearch {
      hosts => ["es-host"]
      index => "not-filtred-index"
    }
  }
}
OR
output {
  if "filtred" in [tags] {
    elasticsearch {
      hosts => ["es-host"]
      index => "filtred-index"
    }
  } else {
    elasticsearch {
      hosts => ["es-host"]
      index => "not-filtred-index"
    }
  }
}
Because now you have both "filtered" and "not_filtered" events in your "not-filtered-index", I think, unless that is on purpose.
Note though that it is better to have an explicit assignment for your not_filtered events: if you have multiple inputs outside of this scope, then events from those inputs might end up in your "not-filtred-index" because they don't contain the "filtred" tag.
Side note: Not sure if you mean filtre (French?) or filter. Just make sure it is consistent, typos are my nemesis
Because now you have both "filtered" and "not_filtered" events in your "not-filtered-index", I think, unless that is on purpose.
The goal is to have all events in the not_filtered index, except those with the word "DEBUG," while the filtered index will only have events with the words "RESPONSE" and "REQUEST."
I think I will go with your first proposal, so in the Filebeat config, without tagging events:
At the Logstash level I will do this by cloning events and filtering on tags and the words REQUEST and RESPONSE:
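A sketch of that Filebeat config (input type, id, and paths are placeholders): only DEBUG lines are dropped at the edge, and no tags are added:

```
filebeat.inputs:
  - type: filestream
    id: app-logs
    paths:
      - /var/log/app/*.log      # placeholder path
    # ship everything except DEBUG lines, untagged, once
    exclude_lines: ['DEBUG']
```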
input {
  # Your input configuration goes here
}
filter {
  clone {
    # With ECS compatibility enabled (the default since Logstash 8) the
    # clone filter adds each clone's name to [tags]; with ECS disabled
    # it is written to the [type] field instead, and the conditionals
    # below would have to test [type] rather than [tags].
    clones => ["not_filtered", "filtered"]
  }
  # The original event carries neither clone tag; drop it so it is not
  # processed and then silently discarded at the output stage.
  if "filtered" not in [tags] and "not_filtered" not in [tags] {
    drop {}
  }
  if "filtered" in [tags] {
    if "REQUEST" in [message] or "RESPONSE" in [message] {
      ## do some parsing, processing and so on
    } else {
      drop {}
    }
  }
}
output {
  if "filtered" in [tags] {
    elasticsearch {
      hosts => ["es-host"]
      index => "filtered-index"
    }
  }
  if "not_filtered" in [tags] {
    elasticsearch {
      hosts => ["es-host"]
      index => "not-filtered-index"
    }
  }
}
This way I don't need to duplicate config at the Filebeat level, and I will transfer events only once from the cloud to on premise.
Yeah, and thanks for spotting my typos, that's my French invading my English.