Assistance with Grok and regex

JeremyP · July 14, 2022, 3:52am

Hello,

I'm attempting to pull the name of a software package from a CPE from NIST. This is my sample data:

cpe:2.3:a:libexpat_project:libexpat:*:*:*:*:*:*:*:*

With a plain regex, the following expression matches the string between the 4th and 5th colons just fine. In Grok within Logstash, however, once I wrap it in the round brackets of a named capture, it no longer matches what I want. From all the examples I've seen, the round brackets are required.

Code:

grok {
  match => { "[software][cpe]" => "(?<[software][name]>^(?:[^:]+:){4}\K[^:]+)" }
}

Output:

{
  "[software][name]": "cpe:2.3:a:libexpat_project:libexpat"
}

Desired Output:

{
  "[software][name]": "libexpat"
}

I'd appreciate some guidance.

Thanks.

JeremyP · July 14, 2022, 2:48pm

I may have answered my own question with this pattern:

%{WORD}[:]%{BASE10NUM}[:]%{WORD}[:]%{WORD}[:]%{WORD:[software][name]}
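One caveat with the pattern above: WORD only matches letters, digits and underscores, so a product name containing a dot or a hyphen would not match. A minimal sketch of an alternative that keeps the original "skip four fields" idea but places only the wanted component inside the named capture, so the skipped prefix never ends up in the field. This is untested, and it assumes that the inline `(?<[software][name]>...)` syntax is accepted by your Logstash version, which the output in the first post suggests:

```
filter {
  grok {
    # Skip the first four colon-delimited components (cpe, 2.3, part, vendor),
    # then capture only the fifth (product) into [software][name].
    match => { "[software][cpe]" => "^(?:[^:]+:){4}(?<[software][name]>[^:]+)" }
  }
}
```

With the sample string, the skipped prefix is "cpe:2.3:a:libexpat_project:" and the captured part should be "libexpat". No \K is needed, because the named group only wraps the part to keep.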

Rios · July 15, 2022, 7:00am

You can also use the csv filter plugin, with ":" as the separator.
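A rough sketch of what that could look like, assuming the CPE string is in [software][cpe]. The column names and the cpe_parts target field are illustrative choices, not required names; the order follows the components of a CPE 2.3 string:

```
filter {
  csv {
    source    => "[software][cpe]"
    separator => ":"
    # Illustrative column names, in the order the components appear.
    columns   => ["prefix", "cpe_version", "part", "vendor", "product",
                  "version", "update", "edition", "language",
                  "sw_edition", "target_sw", "target_hw", "other"]
    # Hypothetical holding field so the columns do not land in the event root.
    target    => "cpe_parts"
  }
  mutate {
    # Copy the product component to the field the original post asks for.
    copy => { "[cpe_parts][product]" => "[software][name]" }
  }
}
```

Note that a plain split on ":" does not handle CPE values that contain an escaped colon, so treat this as a starting point only.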

If you want only "libexpat" from a field containing "libexpat_project", you can either:

split on _ and use the first part, if you don't know what comes after the underscore
use gsub to replace "_project" with "", if you know what comes after the underscore
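
Both options can be sketched with the mutate filter. The field name [software][vendor] is an assumption for wherever "libexpat_project" ends up; use one variant or the other, not both:

```
# Variant A: suffix unknown. Work on a copy, split it, keep the first element.
filter {
  mutate { copy    => { "[software][vendor]" => "[software][name]" } }
  mutate { split   => { "[software][name]" => "_" } }
  # [software][name] is now an array such as ["libexpat", "project"].
  mutate { replace => { "[software][name]" => "%{[software][name][0]}" } }
}

# Variant B: suffix known. Strip a trailing "_project" in place.
filter {
  mutate { gsub => [ "[software][vendor]", "_project$", "" ] }
}
```

The steps in Variant A are kept in separate mutate blocks because a single mutate applies its operations in a fixed internal order rather than the order written. For the sample string in the first post, the product component is already "libexpat", so this only matters when starting from the vendor component.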
