I have a CSV file with the following 3 rows (sample):
reservation date, reservationID
Jan 6th, res:id:adbcj-oksok-gjkk
Jan 10th,
Mar 10th, res:id:kkbcj-oksok-gjkk
I want to drop empty rows and apply a grok filter to reservationID to extract the last element after the "-". This is what I tried, without success:
csv {
  separator => ","
  skip_header => "true"
  autodetect_column_names => "true"
  skip_empty_columns => "true"
  skip_empty_rows => "true"
}
if [reservationID] =~ "" {
  grok {
    reservationID => "MY GROK PATTERN HERE, WHICH IS WORKING FINE EXTERNALLY THROUGH THE DEBUGGER"
  }
}
I was expecting the first and the third row in the output (I am not worried about the grok). Instead I see all 3 rows. Am I missing anything? I do not want the 2nd row in my output.
You do not have any empty rows, so skip_empty_rows will not skip any. The second line will not have a [reservationID] field (because you have set skip_empty_columns), so test for that field and drop the event when it is missing.
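A minimal sketch of such a test, assuming the autodetected field is named exactly reservationID and that dropping the whole event is what you want for rows without a reservation. The csv options are copied from the question; the conditional and the drop filter are the added part:

```
filter {
  csv {
    separator => ","
    skip_header => "true"
    autodetect_column_names => "true"
    skip_empty_columns => "true"
  }

  # With skip_empty_columns set, a row with a blank second column
  # never gets a reservationID field, so test for its absence.
  if ![reservationID] {
    drop { }
  }
}
```

If the event still gets through, it may be worth checking the exact field name on the event: a header written with a space after the comma could produce a column name with a leading space, in which case [reservationID] would not refer to it.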
I did try that and it did not work. I still get the 2nd row. I suspect that the skip_empty_columns condition is stripping the reservationID field even before I get a chance to do what you are suggesting.