Message field split into multiple entries from CSV input

#1

I'm running into an issue where my Logstash message field is being split into multiple events. I'm ingesting a CSV file. A couple of gsub substitutions replace the "," delimiters with "^", because some multi-value strings contain "," themselves. Once the new delimiters are set, I remove all double quotes.

Here is a line of source data:

,,,412,192.168.65.217,,00:50:56:A7:C3:C3,ubuntu,Ubuntu Linux,Ubuntu Linux,16.04,,"2,816",,0,,,0,,308,Full audit without Web Spider,System,0,,ip,Normal,JEREMYs Assets,,59 Days,CVE-2018-19824,http://nvd.nist.gov/vuln/detail/CVE-2018-19824,4.6,7.8,CVSS:3.0/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H,(AV:L/AC:L/Au:N/C:P/I:P/A:P),"In the Linux kernel through 4.19.6, a local user could exploit a use-after-free in the ALSA driver by supplying a malicious USB Sound device (with zero interfaces) that is mishandled in usb_audio_probe in sound/usb/card.c.",ubuntu-cve-2018-19824,Fail,"Vulnerable OS: Ubuntu Linux 16.04


Vulnerable software installed: Ubuntu linux-image-generic 4.4.0.141.147",2018-12-03,"UBUNTU:3879-1,UBUNTU:3879-2,UBUNTU:3930-1,UBUNTU:3930-2,UBUNTU:3931-1,UBUNTU:3931-2,UBUNTU:3933-1,UBUNTU:3933-2","https://usn.ubuntu.com/3879-1/,https://usn.ubuntu.com/3879-2/,https://usn.ubuntu.com/3930-1/,https://usn.ubuntu.com/3930-2/,https://usn.ubuntu.com/3931-1/,https://usn.ubuntu.com/3931-2/,https://usn.ubuntu.com/3933-1/,https://usn.ubuntu.com/3933-2/",221,5,"Canonical,Ubuntu Linux",2019-03-27,vv,Vulnerable Version,Ubuntu: (Multiple Advisories) (CVE-2018-19824): Linux kernel (Trusty HWE) vulnerabilities,2019-03-18

My code:

input {
  file {
    path => "/home/elastic/report44.csv"
    start_position => "beginning"
  }
#   stdin { }
}
filter {
  mutate {
    gsub => ["message", '(?!\B"[^"]*),(?![^"]*"\B)', "^", "message", '"', ""]
  }
  csv {
    separator => "^"
    skip_empty_rows => "true"
    skip_header => "true"
    autogenerate_column_names => "true"
  }
}
output {
  stdout { codec => rubydebug }
}

#3

Please edit your post. Select the configuration and click on </> in the toolbar above the edit panel. You should then see it displayed in the preview panel on the right like this

...

Then do the same for the sample data, then reply to this post and I will take a look.

#4

Logstash output:

{
      "column11" => "16.04",
      "column29" => "59 ",
      "column15" => "0",
      "column23" => "0",
      "column27" => "JEREMYs Assets",
       "column6" => nil,
    "@timestamp" => 2019-05-17T16:06:54.940Z,
       "column8" => "ubuntu",
       "column9" => "Ubuntu Linux",
      "column16" => nil,
      "column10" => "Ubuntu Linux",
      "column21" => "Full audit without Web Spider",
       "column7" => "00:50:56:A7:C3:C3",
      "column26" => "Normal",
       "column5" => "192.168.65.217",
      "column22" => "System",
      "column28" => nil,
          "host" => "elk",
      "column24" => nil,
      "column19" => nil,
      "column14" => nil,
      "column18" => "0",
       "message" => "^^^412^192.168.65.217^^00:50:56:A7:C3:C3^ubuntu^Ubuntu Linux^Ubuntu Linux^16.04^^2,816^^0^^^0^^308^Full audit without Web Spider^System^0^^ip^Normal^JEREMYs Assets^^59 ",
          "path" => "/home/elastic/report44.csv",
      "column17" => nil,
      "column12" => nil,
      "column25" => "ip",
       "column1" => nil,
       "column2" => nil,
       "column4" => "412",
       "column3" => nil,
      "@version" => "1",
      "column13" => "2,816",
      "column20" => "308"
}
{
          "host" => "elk",
       "message" => "Days^CVE-2018-19824^http://nvd.nist.gov/vuln/detail/CVE-2018-19824^4.6^7.8^CVSS:3.0/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H^(AV:L/AC:L/Au:N/C:P/I:P/A:P)^In the Linux kernel through ",
          "path" => "/home/elastic/report44.csv",
       "column6" => "CVSS:3.0/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H",
    "@timestamp" => 2019-05-17T16:06:54.983Z,
       "column8" => "In the Linux kernel through ",
       "column1" => "Days",
       "column2" => "CVE-2018-19824",
       "column4" => "4.6",
       "column7" => "(AV:L/AC:L/Au:N/C:P/I:P/A:P)",
       "column3" => "http://nvd.nist.gov/vuln/detail/CVE-2018-19824",
      "@version" => "1",
       "column5" => "7.8"
}
{
          "host" => "elk",
       "column1" => "4.19.6",
       "column2" => " a local user could exploit a use-after-free in the ALSA driver by supplying a malicious USB Sound device (with zero interfaces) that is mishandled in usb_audio_probe in ",
       "message" => "4.19.6^ a local user could exploit a use-after-free in the ALSA driver by supplying a malicious USB Sound device (with zero interfaces) that is mishandled in usb_audio_probe in ",
          "path" => "/home/elastic/report44.csv",
      "@version" => "1",
    "@timestamp" => 2019-05-17T16:06:54.983Z
}
{
          "host" => "elk",
       "column1" => "sound/usb/card.c.",
       "column2" => "ubuntu-cve-2018-19824",
       "message" => "sound/usb/card.c.^ubuntu-cve-2018-19824^Fail^Vulnerable OS: Ubuntu Linux 16.04 Vulnerable software installed: Ubuntu linux-image-generic ",
          "path" => "/home/elastic/report44.csv",
       "column4" => "Vulnerable OS: Ubuntu Linux 16.04 Vulnerable software installed: Ubuntu linux-image-generic ",
       "column3" => "Fail",
      "@version" => "1",
    "@timestamp" => 2019-05-17T16:06:54.984Z
}
{
          "host" => "elk",
       "message" => "4.4.0.141.147^2018-12-03^UBUNTU:3879-1,UBUNTU:3879-2,UBUNTU:3930-1,UBUNTU:3930-2,UBUNTU:3931-1,UBUNTU:3931-2,UBUNTU:3933-1,UBUNTU:3933-2^https://usn.ubuntu.com/3879-1/,https://usn.ubuntu.com/3879-2/,https://usn.ubuntu.com/3930-1/,https://usn.ubuntu.com/3930-2/,https://usn.ubuntu.com/3931-1/,https://usn.ubuntu.com/3931-2/,https://usn.ubuntu.com/3933-1/,https://usn.ubuntu.com/3933-2/^221^5^Canonical^Ubuntu ",
          "path" => "/home/elastic/report44.csv",
       "column6" => "5",
    "@timestamp" => 2019-05-17T16:06:54.984Z,
       "column8" => "Ubuntu ",
       "column1" => "4.4.0.141.147",
       "column2" => "2018-12-03",
       "column4" => "https://usn.ubuntu.com/3879-1/,https://usn.ubuntu.com/3879-2/,https://usn.ubuntu.com/3930-1/,https://usn.ubuntu.com/3930-2/,https://usn.ubuntu.com/3931-1/,https://usn.ubuntu.com/3931-2/,https://usn.ubuntu.com/3933-1/,https://usn.ubuntu.com/3933-2/",
       "column7" => "Canonical",
       "column3" => "UBUNTU:3879-1,UBUNTU:3879-2,UBUNTU:3930-1,UBUNTU:3930-2,UBUNTU:3931-1,UBUNTU:3931-2,UBUNTU:3933-1,UBUNTU:3933-2",
      "@version" => "1",
       "column5" => "221"
}
{
          "host" => "elk",
       "message" => "Linux^2019-03-27^vv^Vulnerable Version^Ubuntu: (Multiple Advisories) (CVE-2018-19824): Linux kernel (Trusty HWE) vulnerabilities^2019-03-18 ",
          "path" => "/home/elastic/report44.csv",
       "column6" => "2019-03-18 ",
    "@timestamp" => 2019-05-17T16:06:54.989Z,
       "column1" => "Linux",
       "column2" => "2019-03-27",
       "column4" => "Vulnerable Version",
       "column3" => "vv",
      "@version" => "1",
       "column5" => "Ubuntu: (Multiple Advisories) (CVE-2018-19824): Linux kernel (Trusty HWE) vulnerabilities"
}

#5

Post updated.

#6

Not sure I can help. I think you might be attacking the wrong problem. If you can find a way to join those two lines then a simple

csv {}

will parse it into 50 columns.
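
For what it's worth, the reason the record falls apart is that the file input emits one event per physical line, and one of the quoted fields in the sample contains line breaks, so the row is already in pieces before any filter runs. One way to do the join might be a multiline codec on the file input. This is only a sketch, not something tested against the full report: the pattern is an assumption taken from the single sample row, which starts with three empty fields and a number (",,,412,"), and the auto_flush_interval value is arbitrary. Any line that does not look like the start of a record is appended to the previous one.

```
input {
  file {
    path => "/home/elastic/report44.csv"
    start_position => "beginning"
    codec => multiline {
      # Assumption: every record starts like the sample row (",,,412,").
      # Adjust the pattern to whatever reliably marks a new row in the real file.
      pattern => "^,,,\d+,"
      # Lines that do NOT match the pattern are glued onto the previous line.
      negate => true
      what => "previous"
      # Flush the final record after 2 seconds instead of waiting for a next line.
      auto_flush_interval => 2
    }
  }
}
filter {
  # The csv filter defaults (separator "," and quote_char '"') already handle
  # commas inside quoted fields, so the gsub step should not be needed.
  csv { }
}
output {
  stdout { codec => rubydebug }
}
```

If the real file has rows that do not begin this way (a header line, or rows where the first three fields are populated), the pattern will need to change, since a wrong pattern would merge separate records into one event.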