Assistance with Grok and regex

Hello,

I'm attempting to pull the name of a software package from a CPE from NIST. This is my sample data:

cpe:2.3:a:libexpat_project:libexpat:*:*:*:*:*:*:*:*

In a plain regex tester, the following expression matches the string between the 4th and 5th colons just fine. In Grok within Logstash, however, once I wrap it in the round brackets of a named capture, it no longer matches what I want. From all the examples I've seen, the round brackets are required.

Code:

grok {
  match => { "[software][cpe]" => "(?<[software][name]>^(?:[^:]+:){4}\K[^:]+)" }
}

Output:

{
  "[software][name]": "cpe:2.3:a:libexpat_project:libexpat"
}

Desired Output:

{
  "[software][name]": "libexpat"
}

I'd appreciate some guidance.

Thanks.

It's possible I've answered my own question:

%{WORD}[:]%{BASE10NUM}[:]%{WORD}[:]%{WORD}[:]%{WORD:[software][name]}
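
For anyone who would rather keep the regex approach: a likely explanation for the original result is that \K only resets the start of the overall match, while a named capture group still records everything matched inside its round brackets. A minimal sketch, assuming the prefix can simply be moved outside the group. The flat field name software_name is hypothetical and not from the original post; it is used here to avoid relying on bracketed field references inside an inline named capture.

```
filter {
  grok {
    # Skip four colon-terminated fields outside the capture group,
    # then capture the fifth field (everything up to the next colon).
    match => { "[software][cpe]" => "^(?:[^:]+:){4}(?<software_name>[^:]+)" }
  }
}
```

Compared with the WORD-based pattern, [^:]+ also accepts product names containing characters such as "-" or ".", which WORD (\b\w+\b) would not match. Neither version accounts for colons escaped inside a CPE value.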

You can also use the csv filter plugin, with ":" as the separator.
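
A rough sketch of the csv approach, assuming the CPE string lives in [software][cpe] as in the question. The column names and the cpe_parts target field are hypothetical labels chosen for illustration. Columns beyond the five listed should get autogenerated names, and escaped colons inside a CPE value are not handled.

```
filter {
  csv {
    source    => "[software][cpe]"
    separator => ":"
    # Names for the first five colon-separated fields of a CPE 2.3 string.
    columns   => ["prefix", "cpe_version", "part", "vendor", "product"]
    target    => "[cpe_parts]"
  }
  mutate {
    # Move the fifth field to where the question wants it.
    rename => { "[cpe_parts][product]" => "[software][name]" }
  }
}
```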

If you want only "libexpat" from the field "libexpat_project", you can:

  • split on "_" and use the first part, if you don't know what follows the underscore
  • use gsub to replace "_project" with "", if you know what follows the underscore
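
The two options above might look like this with the mutate filter. This is a sketch only, assuming the value to trim has already been extracted into [software][name]; pick one of the two variants, not both.

```
filter {
  # Variant 1: the suffix is known, so strip a trailing "_project".
  mutate {
    gsub => [ "[software][name]", "_project$", "" ]
  }

  # Variant 2: the suffix is unknown, so split on "_" and keep the first part.
  # Two mutate blocks are used so the split runs before the replace.
  # mutate { split   => { "[software][name]" => "_" } }
  # mutate { replace => { "[software][name]" => "%{[software][name][0]}" } }
}
```

Note that in the sample CPE the fifth field is already "libexpat", so this step is only needed when working from the fourth field ("libexpat_project").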
