Logstash mutate split but maintain the separator character


#1

I want to split a field into an array, but I want the array values to keep the separator characters. It seems that the way mutate split works is to remove the separator characters.

Is there another way to accomplish this?

So, if my field has a value of:

linux-image-4.4.0-1054-aws:amd64 (4.4.0-1054.63, automatic), linux-headers-4.4.0-1054-aws:amd64 (4.4.0-1054.63, automatic), linux-aws-headers-4.4.0-1054:amd64 (4.4.0-1054.63, automatic)

I want to separate this by matching "), " so that I would get an array like this:

linux-image-4.4.0-1054-aws:amd64 (4.4.0-1054.63, automatic), 
linux-headers-4.4.0-1054-aws:amd64 (4.4.0-1054.63, automatic), 
linux-aws-headers-4.4.0-1054:amd64 (4.4.0-1054.63, automatic)

But using mutate split I get:

linux-image-4.4.0-1054-aws:amd64 (4.4.0-1054.63, automatic, 
linux-headers-4.4.0-1054-aws:amd64 (4.4.0-1054.63, automatic, 
linux-aws-headers-4.4.0-1054:amd64 (4.4.0-1054.63, automatic

(missing the closing parenthesis - which I need for future grok parsing)

Any suggestions on any method of accomplishing this would be huge!


(Ry Biesemeyer) #2

Since the mutate filter applies gsub directives before split directives, it is possible to use a positive-lookbehind assertion to inject a character on which we can later split:

  • pattern: "(?<=\)), " matches a comma-space sequence that is preceded by a literal closing paren
  • replacement: "|" is a pipe character (whatever sequence you use MUST NOT appear naturally in your messages)

filter {
  mutate {
    # replace any comma-space that is preceded by a closing paren with a pipe
    gsub => ["message", "(?<=\)), ", "|"]
    # split on the pipe
    split => { "message" => "|" }
  }
}
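
If you want to check the pattern outside Logstash, here is a minimal plain-Ruby sketch of the same two steps. It is not the mutate filter's actual implementation; the variable names and the SENTINEL constant are made up for illustration, and it assumes the sentinel never occurs in the real data.

```ruby
# Plain-Ruby sketch of the gsub-then-split idea (illustrative only,
# not Logstash source code).
message = "linux-image-4.4.0-1054-aws:amd64 (4.4.0-1054.63, automatic), " \
          "linux-headers-4.4.0-1054-aws:amd64 (4.4.0-1054.63, automatic), " \
          "linux-aws-headers-4.4.0-1054:amd64 (4.4.0-1054.63, automatic)"

# Hypothetical sentinel; it must not appear naturally in the message.
SENTINEL = "|"

# Step 1: replace each ", " that directly follows ")" with the sentinel.
# The lookbehind (?<=\)) asserts the ")" without consuming it, so the
# closing paren stays in the string.
marked = message.gsub(/(?<=\)), /, SENTINEL)

# Step 2: split on the sentinel. A String argument is treated literally,
# so "|" is not interpreted as regex alternation here.
packages = marked.split(SENTINEL)

packages.each { |pkg| puts pkg }
```

The comma-space inside each parenthesized group is left alone because it is not preceded by a closing paren, so only the separators between packages are replaced.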

With your sample input as the message, on Logstash 6.2.2, the filter above gave me what I believe is the output you expect:

{
       "message" => [
        [0] "linux-image-4.4.0-1054-aws:amd64 (4.4.0-1054.63, automatic)",
        [1] "linux-headers-4.4.0-1054-aws:amd64 (4.4.0-1054.63, automatic)",
        [2] "linux-aws-headers-4.4.0-1054:amd64 (4.4.0-1054.63, automatic)"
    ],
    "@timestamp" => 2018-05-03T00:00:18.026Z,
      "@version" => "1",
          "host" => "castrovel.local"
}

#3

Dude! Yes! This works perfectly.

Thank you for your help. I appreciate it.

