Logstash/Grok: Parsing CSV with columns including semicolons and escaped double quotes

Hi all,

I am trying to write a grok parser for a log in CSV format. The columns are separated by semicolons, and each column is enclosed in double quotes. Here is an example:
"2017-03-07T08:07:39.585Z";"WARN";"";"";"This is an example customer error Message";

Until recently, using a greedy regex that matches everything up to the double quote ending the field worked fine for me.
CSVCONTENT [^"]*
APPLOG "%{CSVCONTENT:logtime:string}";"%{CSVCONTENT:LogLevel:string}";"%{CSVCONTENT:Col1:string}";"%{CSVCONTENT:Col2:string}";"%{CSVCONTENT:Message:string}";

However, some of the messages we receive have started to include semicolons and double quotes as well. Any double quotes within a column are escaped by doubling them ("").

Here is an example:
"2017-03-07T08:07:39.585Z";"WARN";"";"";"Search term ""abcde"" not found; Processing stopped";
Expected parsing would be:
logtime: 2017-03-07T08:07:39.585Z
LogLevel: WARN
Col1:
Col2:
Message: Search term ""abcde"" not found; Processing stopped

Obviously my parser fails at the first double quote in front of abcde. As far as I understand it, doubling double quotes is a valid way of escaping them within a CSV. And as long as the fields are quoted, using the delimiter (semicolon) within the text is also OK, so there is not much I can do to get this changed.
As fields can be empty, simply looking for a single occurrence of a double quote instead of a doubled double quote does not work either. So I need to treat the string ";" (double quote, semicolon, double quote) as the separator, but I lack the regexp and grok experience to implement this. Any hints on how this can be done?

Thanks

Chris

How about just using the csv filter?
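
A minimal sketch of what that could look like for the sample lines above. It assumes the raw line arrives in the default "message" field; the column names are simply taken from the grok pattern in the question. The separator, quote_char and columns options are standard csv filter settings, but the rest of the pipeline is not shown here, so treat this as a starting point rather than a tested config.

```
filter {
  csv {
    # Assumption: the raw log line is in the default "message" field.
    source => "message"
    # Columns are separated by semicolons instead of commas.
    separator => ";"
    # Fields are wrapped in double quotes (this is also the default).
    quote_char => '"'
    # Names copied from the grok pattern in the question.
    columns => ["logtime", "LogLevel", "Col1", "Col2", "Message"]
  }
}
```

Because the filter does real CSV parsing, a doubled quote ("") inside a quoted field should come out as a single literal quote, and a semicolon inside a quoted field should stay part of that field. The trailing semicolon at the end of each line may produce an extra empty column, so check the resulting event for an unexpected auto-named field.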

Main reason is: I did not look into using filters. :slight_smile:
From a first look this should work for me. I will give this a shot. Thanks a lot for your quick help!

Worked perfectly out of the box. Thanks again.
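
For anyone who has to stay with grok, one possible fix is to let the field pattern accept either a non-quote character or a doubled quote. This is only a sketch in the same pattern-file style as the question, and it has not been run against real data. The :string suffixes from the original pattern are left out here, on the assumption that they are not needed for plain string fields.

```
CSVCONTENT (?:[^"]|"")*
APPLOG "%{CSVCONTENT:logtime}";"%{CSVCONTENT:LogLevel}";"%{CSVCONTENT:Col1}";"%{CSVCONTENT:Col2}";"%{CSVCONTENT:Message}";
```

With this pattern a lone double quote can only end a field, while "" is consumed as part of the content, so empty fields still match. Unlike the csv filter, grok would leave the doubled quotes in the captured value as they are.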
