RegEx in Logstash Conf File: loop through/retrieve all matches

Let's suppose my log messages are as follows:

GroupName::Cars: mileage=19274 ~ year=2000
GroupName::school: students = 10000 ~ location = USA ~ classes = 50 ~ staff = 75

How can I match both of these lines using the same RegEx pattern in the Logstash configuration file? Is there a way to keep matching across an unspecified number of "~" separators?

I know that it is possible to create two separate grok filters as below, but I'm wondering if there's a cleaner way to collapse the two pattern matches into one.

if [message] =~ "Cars" {
    grok {
        match => {
            "message" => "%{GREEDYDATA}%{KEYWORD}=%{MILEAGE:mileage} ~ %{KEYWORD}=%{YEAR:year}"
        }
        pattern_definitions => {
            "KEYWORD" => "[\w]{4,7}"
            "MILEAGE" => "[\d]{1,10}"
            "YEAR" => "[\d]{4}"
        }
    }
}

if [message] =~ "school" {
    grok {
        match => {
            "message" => "%{GREEDYDATA}%{KEYWORD} = %{STUDENTS:students} ~ %{KEYWORD} = %{LOCATION:location} ~ %{KEYWORD} = %{CLASSES:classes} ~ %{KEYWORD} = %{STAFF:staff}"
        }
        pattern_definitions => {
            "KEYWORD" => "[\w]{4,8}"
            "STUDENTS" => "[\d]{1,10}"
            "LOCATION" => "[\w]{1,10}"
            "CLASSES" => "[\d]{1,10}"
            "STAFF" => "[\d]{1,10}"
        }
    }
}

If you really want to match a single pattern you could use alternation, but it doesn't make much sense to me.

grok { match => { "message" => "(pattern1|pattern2)" } }

I would combine the two groks into one and match [message] against an array of patterns...

grok {
    match => {
        "message" => [
            "%{KEYWORD}=%{MILEAGE:mileage} ~ %{KEYWORD}=%{YEAR:year}",
            "%{KEYWORD} = %{STUDENTS:students} ~ %{KEYWORD} = %{LOCATION:location} ~ %{KEYWORD} = %{CLASSES:classes} ~ %{KEYWORD} = %{STAFF:staff}"
        ]
        ...

Note that the leading %{GREEDYDATA} is not needed. The patterns are not anchored, so they do not have to match from the beginning of the field.
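
For reference, here is a rough, untested sketch of what a complete filter along these lines might look like. It swaps the custom KEYWORD/MILEAGE/etc. definitions for the stock WORD and INT grok patterns, which is my own assumption and not part of the original config, so adjust it if you need the stricter length limits.

```conf
filter {
  grok {
    # Patterns are tried in order; with the default break_on_match => true,
    # grok stops at the first pattern that matches.
    match => {
      "message" => [
        # e.g. GroupName::Cars: mileage=19274 ~ year=2000
        "%{WORD}=%{INT:mileage} ~ %{WORD}=%{INT:year}",
        # e.g. GroupName::school: students = 10000 ~ location = USA ~ classes = 50 ~ staff = 75
        "%{WORD} = %{INT:students} ~ %{WORD} = %{WORD:location} ~ %{WORD} = %{INT:classes} ~ %{WORD} = %{INT:staff}"
      ]
    }
  }
}
```

Note that the two patterns differ in the whitespace around "=", so each sample line should only match its own pattern.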

Thanks! Let's say that the log messages may have new GroupNames in the future with an unspecified number of parameters. All I know is that the messages will be of this format:
GroupName:: <parameter_1 name> = <parameter_1 value> ~ <parameter_2 name> = <parameter_2 value> ~ <parameter_n name> = <parameter_n value>

What would be a good way to pattern match?

Something like this:

%{DATA:data}::%{WORD:category}:%{SPACE}%{WORD:keyword}=%{NUMBER:value}?(%{SPACE} ~ %{WORD:keyword}=%{NUMBER:value})*

The ?(...)* means it's optional I'm assuming? So if I had up to 5 parameters I would do something like this?

%{DATA:data}::%{WORD:category}:%{SPACE}?(%{SPACE} ~ %{WORD:keyword}=%{NUMBER:value})*?(%{SPACE} ~ %{WORD:keyword}=%{NUMBER:value})*?(%{SPACE} ~ %{WORD:keyword}=%{NUMBER:value})*?(%{SPACE} ~ %{WORD:keyword}=%{NUMBER:value})*?(%{SPACE} ~ %{WORD:keyword}=%{NUMBER:value})*

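'*' means zero or more repetitions of the preceding group, as in most regex flavors, and '?' makes the preceding element optional. Because of the '*', the single (...)* group already covers any number of additional parameters, so there is no need to repeat it five times.

For example: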

GroupName::Cars: mileage=19274 ~ year=2000

{
  "data": [
    [
      "GroupName"
    ]
  ],
  "category": [
    [
      "Cars"
    ]
  ],
  "keyword": [
    [
      "mileage",
      "year"
    ]
  ],
  "value": [
    [
      "19274",
      "2000"
    ]
  ]
}

Use mutate+gsub to remove the prefix then use a kv filter.
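
A minimal sketch of that idea, assuming the message always starts with the "GroupName::<group>: " prefix and that "~" and "=" never appear inside a name or value. The gsub regex and the trim settings are my own guesses at what fits the sample lines, not something tested against real logs.

```conf
filter {
  mutate {
    # Drop everything up to and including "GroupName::<group>: ".
    # The regex is an assumption based on the two sample lines.
    gsub => [ "message", "^[^:]+::[^:]+: *", "" ]
  }
  kv {
    source      => "message"
    field_split => "~"   # parameters are separated by "~"
    value_split => "="   # names and values are separated by "="
    trim_key    => " "   # strip the spaces left around names
    trim_value  => " "   # strip the spaces left around values
  }
}
```

With this, "mileage=19274 ~ year=2000" should end up as the fields mileage and year, and a new group with different parameters needs no config change. The gsub throws the group name away, so if you need it, capture it into its own field (for example with a grok) before the mutate.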

What's the correct syntax for the statement below?

if [path] =~ "a" OR [path] =~ "b"?

I want to apply the same filter if path contains either of two particular keywords. Rather than copy/paste the filter, I want to collapse it into one conditional. So instead of

if [path] =~ "a" {
    # some filter
} else if [path] =~ "b" {
    # same filter
}

I want
if [path] =~ "a" OR [path] =~ "b" {
    # some filter
}

Lower case or. Logstash's boolean operators (and, or, nand, xor) are all written in lowercase.
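
So the combined conditional would look roughly like this. The mutate with its tag name is just a hypothetical placeholder for whatever filter you actually run.

```conf
filter {
  if [path] =~ "a" or [path] =~ "b" {
    # Placeholder for the shared filter; the tag name is made up.
    mutate { add_tag => [ "matched_a_or_b" ] }
  }
}
```

If both keywords are plain substrings, a single regex with alternation, such as [path] =~ /a|b/, should behave the same way.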
