Alternatively, you could use a grok filter instead of csv to parse your lines. If upload-bw and download-bw do not contain numbers, the pattern will not match, and you can drop the event based on the _grokparsefailure tag.
    filter {
      grok {
        match => { "message" => "%{WORD:process-name}, %{NUMBER:upload-bw}, %{NUMBER:download-bw}, (?<process-owner>([a-zA-Z]*)), (?<filename>([a-zA-Z\-\_\.]*)), %{WORD:hostname}" }
      }
      if "_grokparsefailure" in [tags] {
        drop { }
      }
    }
This is just an example; you would have to spend time debugging your grok/regex to match your data, whereas your csv solution with the ruby code works out of the box.
I'm adding this as an option because it might be worth investigating which solution is more "expensive" in resources.
I tried the ruby code you suggested and it works! Well, it's not dropping the whole message, but it empties the upload-bw and download-bw fields so the index doesn't conflict anymore.
I changed it a bit: instead of removing the field, I call event.set(x, 0.0).
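For anyone reading along, here is a rough sketch of what that change might look like. The original ruby code is not quoted in this part of the thread, so the loop, the helper name zero_non_numeric and the StubEvent class are assumptions made for illustration. Only event.get and event.set are real Logstash Event API calls; StubEvent is there so the snippet runs outside Logstash.

```ruby
# Hypothetical stand-in for a Logstash event, only so this runs outside Logstash.
class StubEvent
  def initialize(data)
    @data = data
  end

  def get(field)
    @data[field]
  end

  def set(field, value)
    @data[field] = value
  end
end

# Set each listed field to 0.0 when its value cannot be parsed as a number.
# Values that already parse as numbers are left untouched.
def zero_non_numeric(event, fields)
  fields.each do |field|
    begin
      Float(event.get(field))        # raises on "n/a", "", nil, etc.
    rescue ArgumentError, TypeError
      event.set(field, 0.0)          # instead of removing the field
    end
  end
  event
end

event = StubEvent.new("upload-bw" => "n/a", "download-bw" => "12.5")
zero_non_numeric(event, ["upload-bw", "download-bw"])
puts event.get("upload-bw")
puts event.get("download-bw")
```

Inside a ruby filter, the body of the loop would presumably go in the code option and operate on the real event object, with no stub needed.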