CSV quote char inside data

Hi there,

I'm trying to ingest syslog data with the CSV filter.

I guess the issue is related to the " inside the URL part (acme.org&iid={"-123456":4}&sid=123458&tid=123) of the log, which is the same char as the "quote_char".

Is there an elegant solution to this? (I can't change the input format ;-))

Many thanks!

Logstash Error

[2021-11-16T07:00:10,105][WARN ][logstash.filters.csv     ] Error parsing csv {:field=>"message", :source=>"\"Tue Nov 12 12:12:12 2021\",\"xxx\",\"HTTP\",\"acme.org&iid={\"-123456\":4}&sid=123458&tid=123\",\"Allowed\",\"General Browsing\",\"General Browsing\",\"1332\",\"432\",\"188\",\"188\",\"Business Use\",\"Information Technology\",\"Web Search\",\"None\",\"None\",\"0\",\"None\",\"None\",\"ABC\",\"R&D\",\"\",\"\",\"GET\",\"200\",\"ABC\",\"None\",\"None\",\"None\",\"image/gif\",\"None\",\"123\",\"123\"\n", :exception=>#<CSV::MalformedCSVError: Missing or stray quote in line 1>}

Logstash Pipeline

input {
  syslog {
    port => 1234
    tags => [ "some-logs" ]
  }
}

filter {
  if "some-logs" in [tags] {
    csv {
      columns => ["time","login","proto","eurl","action","appname","appclass","reqsize","respsize","stime","ctime","urlclass","urlsupercat","urlcat","malwarecat","threatname","riskscore","dlpeng","dlpdict","location","dept","cip","sip","reqmethod","respcode","ua","ereferer","ruletype","rulelabel","contenttype","unscannabletype","deviceowner","devicehostname"]
    }
  }
}

In a CSV, if a field contains double quotes, the entire field must be enclosed in double quotes, and any double quotes within the field must be escaped with a second double quote (see items 5, 6, and 7 in section 2 of RFC 4180):

 foo,"a ""b"" c",bar
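To see the rule in action outside Logstash, here is a minimal Ruby sketch using the standard csv library. The two lines are shortened stand-ins for your log line (three fields instead of 33), so treat them as illustrative rather than your real data. The first has bare quotes inside the URL field and is rejected; the second doubles them and parses.

```ruby
require "csv"

# Shortened stand-in for the failing line: the URL field contains bare quotes.
bad  = %q{"HTTP","acme.org&iid={"-123456":4}&sid=123458","Allowed"}
# The same line with the interior quotes doubled, as RFC 4180 requires.
good = %q{"HTTP","acme.org&iid={""-123456"":4}&sid=123458","Allowed"}

# Parsing the unescaped line raises CSV::MalformedCSVError.
bad_error = nil
begin
  CSV.parse_line(bad)
rescue CSV::MalformedCSVError => e
  bad_error = e
end

# The escaped line parses, and the doubled quotes collapse back to single ones.
fields = CSV.parse_line(good)
```

The exact error message depends on the csv library version, so only the exception class is worth relying on.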

The underlying Ruby CSV class has a liberal_parsing option that relaxes these requirements, but the csv filter does not set or expose it.

You will need to modify the message format. You may be able to do that using mutate+gsub if the input format is predictable enough.
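As a rough idea of what that substitution could look like, here is a Ruby sketch of a regex that doubles every quote that is not at a field boundary. The pattern and the helper name escape_interior_quotes are my own assumptions, not anything provided by Logstash. It also assumes every field is quoted and that the data never contains a quote directly next to a comma, in which case the line is ambiguous and no regex can repair it reliably.

```ruby
require "csv"

# Hypothetical pattern: a quote that has a non-comma character before it and
# a character other than comma or line ending after it. Opening and closing
# field quotes sit next to a comma or a line boundary, so they are left alone.
INTERIOR_QUOTE = /(?<=[^,])"(?=[^,\r\n])/

# Hypothetical helper: double each interior quote so the line becomes valid CSV.
def escape_interior_quotes(line)
  line.gsub(INTERIOR_QUOTE, '""')
end

# Shortened version of the failing line, including empty fields and the newline.
line = %q{"Tue Nov 12 12:12:12 2021","xxx","HTTP","acme.org&iid={"-123456":4}&sid=123458&tid=123","Allowed","",""} + "\n"

repaired = escape_interior_quotes(line)
fields = CSV.parse_line(repaired)
```

Since mutate's gsub also takes a Ruby regular expression, the same pattern should carry over to a gsub on the message field placed before the csv filter, but I have not tested that in a pipeline, so check it against real traffic first.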
