I am trying to write a grok parser for a log in CSV format. The columns are separated by semicolons and each column is enclosed in double quotes. Here is an example:
"2017-03-07T08:07:39.585Z";"WARN";"";"";"This is an example customer error Message";
Until recently, using a greedy regex that looks for the double quote at the end of the field worked fine for me.
However, some of the messages we receive have started to include semicolons and double quotes as well. Any double quote within a column is escaped by doubling it ("").
Here is an example:
"2017-03-07T08:07:39.585Z";"WARN";"";"";"Search term ""abcde"" not found; Processing stopped";
Expected parsing would be:
Message: Search term ""abcde"" not found; Processing stopped
Obviously my parser fails at the first double quote in front of abcde. As far as I understand it, doubling double quotes is a valid way of escaping them within CSV. And as long as the field is quoted, using the delimiter (semicolon) within the text is also OK, so there is not much I can do to get this changed.
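To illustrate that the format itself is regular CSV, here is a minimal sketch using Python's standard csv module (not grok). The helper name parse_line is only for illustration. Note that the csv module unescapes the doubled quotes, so the message comes back with single quotes around abcde, and the trailing semicolon produces one extra empty field at the end.

```python
import csv


def parse_line(line: str) -> list[str]:
    """Parse one semicolon-separated, double-quoted log line (hypothetical helper)."""
    # doublequote=True tells the reader that "" inside a quoted field means a literal "
    reader = csv.reader([line], delimiter=";", quotechar='"', doublequote=True)
    return next(reader)


sample = (
    '"2017-03-07T08:07:39.585Z";"WARN";"";"";'
    '"Search term ""abcde"" not found; Processing stopped";'
)

fields = parse_line(sample)
print(fields[4])
```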
As fields can be empty (""), simply looking for a single double quote rather than a doubled one does not work either. So I need to have the string ";" (double quote, semicolon, double quote) as a separator, but I am lacking the regexp and grok experience to implement this. Any hints on how this can be done?
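One possible direction, sketched in Python rather than grok: instead of splitting on the separator, describe each field as an opening quote, then any run of "non-quote character or doubled quote", then a closing quote followed by a semicolon or the end of the line. The names FIELD and split_fields are made up for this sketch, and it is untested inside grok itself. As far as I know grok patterns accept plain regular expressions with named groups such as (?<message>...), so the inner part may carry over, but treat that as an assumption.

```python
import re

# One quoted field: opening quote, then any mix of non-quote characters
# or doubled quotes (""), then the closing quote and a ';' or end of line.
FIELD = re.compile(r'"((?:[^"]|"")*)"(?:;|$)')


def split_fields(line: str) -> list[str]:
    """Return the raw content of every quoted field (hypothetical helper)."""
    # Doubled quotes are kept as-is, matching the expected output in the question.
    return FIELD.findall(line)


sample = (
    '"2017-03-07T08:07:39.585Z";"WARN";"";"";'
    '"Search term ""abcde"" not found; Processing stopped";'
)

for value in split_fields(sample):
    print(repr(value))
```

Empty fields come back as empty strings because the inner group is allowed to match nothing, and a semicolon inside the message is harmless because it is just another non-quote character.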