Can dissect use a variable number of fields?

I'm trying to use Logstash to ingest some old, OLD IAS/RRAS logs that seem to be customized. They all begin the same, so I can get dissect to ingest this correctly:

filter {
	dissect {
		mapping => {
			"message" => "%{ServerName},%{UserName},%{log_timestamp},%{+log_timestamp},%{ServiceType}"
		}
	}
}

And I've read the docs and found how I can use %{?key},%{&key} to consume a key=value pair that is split like that... but since the logs seem to be customized, they may not always have the same number of key=value fields.

I haven't found anything that says I CAN do something for this, like mixing in regex to make a %{?key},%{&key}+, which would indicate the pair can occur more than once. I did see the "Dissect does not use regular expressions" warning in the docs, though. Is there any way to achieve this?

((edit)) still searching, I found this answer... makes me wonder if this would be the way to go...

Dissect is designed to be fast, and that is achieved by having it support a reasonably small number of rules that allow a single pass of the data. I therefore do not think what you are asking for is possible. I would recommend parsing as much as possible using dissect and then using some other filter for the rest.


I'm far away from work now, but since it's still in my head... Yeah, I'm figuring out that maybe I should dissect only the first few fields and treat the keys and values in some other way. Maybe try my hand at Ruby, doesn't seem too hard, even for a non-programmer like me!

Have you considered the kv filter?

I did, but haven't found anything in the documentation that would indicate I can use it. The file format uses only commas; there is no other separator character. So I have things like key1,value1,key2,value2,key3,value3

Aha. Then it will likely not work.

Yeah, and as I've said on the initial post, there can be a variable number of kv pairs... Guess I should go for a Ruby solution. Once I get it working I'll post my solution, who knows if anyone else out there will ever need it :grin:

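    ruby {
        code => '
            # Split the comma-separated line into an alternating key/value list
            a = event.get("message").split(",")
            h = {}
            # Consume two items at a time; a dangling last item is ignored
            while a.length > 1
                h = h.merge( [a.shift(2)].to_h )
            end
            event.set("someField", h)
        '
    }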

will turn that into

    "someField" => {
        "key1" => "value1",
        "key3" => "value3",
        "key2" => "value2"
    }
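
For anyone who wants to try the pairing logic outside Logstash first, here is a minimal standalone sketch of the same idea. The helper name `pairs_to_hash` is made up for illustration and is not part of any plugin; it uses `each_slice` instead of the `while` loop, and it assumes a dangling key without a value should simply be dropped, which is what the loop above does too.

```ruby
# Standalone sketch of the pairing logic, runnable with plain Ruby.
# pairs_to_hash is a hypothetical helper name, not a Logstash API.
def pairs_to_hash(line)
  fields = line.split(",")
  # Drop a dangling key that has no value, matching the while loop's behavior.
  fields.pop if fields.length.odd?
  # Group the list two at a time and turn the [key, value] pairs into a hash.
  fields.each_slice(2).to_h
end

result = pairs_to_hash("key1,value1,key2,value2,key3,value3")
puts result.inspect
```

Because this is plain Ruby, the body of the helper could be pasted into the `code` option of the ruby filter, as long as it contains no single quotes.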

AHHHH That's awesome, @Badger!! Thanks!
Now I can just dissect the first few static fields and ruby the rest :heart:

@Badger Looks like I messed up something, forgot about servername value... Sorry about that! Code works =D now to remove that pesky \r at the end of values...

	dissect {
		mapping => {
			"message" => "%{NASIPAddress},%{UserName},%{log_timestamp},%{+log_timestamp},%{ServiceType},%{ServerName},%{values}"
		}
	}

	ruby {
		code => '
			a = event.get("values").split(",")
			h = {}
			while a.length > 1
				h = h.merge( [a.shift(2)].to_h )
			end
			event.set("keyvalues", h)
		'
	}

	mutate {
		remove_field => ["values"]
	}
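
As for the trailing \r, one possible approach (a sketch, not necessarily what ended up in the final config) is to call `chomp` on the string before splitting it. The sample `values` string below is invented for illustration.

```ruby
# Hypothetical sample of the "values" field with a Windows line ending still attached.
values = "4,10.0.0.1,6,2\r"

# chomp removes one trailing "\r\n", "\n" or "\r" and leaves everything else alone.
fields = values.chomp.split(",")

# Pair the items up, keeping only complete [key, value] pairs.
keyvalues = fields.each_slice(2).select { |pair| pair.length == 2 }.to_h
puts keyvalues.inspect
```

Inside the ruby filter this would amount to using `event.get("values").chomp.split(",")` in place of the plain `split`.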

((edit)) And now I came up with a solution that doesn't use ruby!

	dissect {
		mapping => {
			"message" => "%{NASIPAddress},%{UserName},%{log_timestamp},%{+log_timestamp},%{ServiceType},%{ServerName},%{values}"
		}
	}

	mutate {
		gsub => [
			"values", "([^,]+),([^,]+),?", "\1=\2,"
		]
	}

	kv {
		field_split => ","
		source => "values"
	}
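
To see what that gsub pattern does to a line, here is a small plain-Ruby sketch that applies the same regex and then imitates, very roughly, the comma and `=` splitting that kv performs. The sample string is invented, and the splitting code is only an approximation of kv, not its actual implementation.

```ruby
# Hypothetical sample of the "values" field: alternating keys and values.
values = "4,10.0.0.1,6,2,5,17"

# Same pattern and replacement as in the mutate filter above:
# every "key,value" pair becomes "key=value,".
rewritten = values.gsub(/([^,]+),([^,]+),?/, '\1=\2,')

# Rough imitation of kv with field_split => ",": split on commas,
# then split each piece on the first "=".
parsed = rewritten.split(",").map { |pair| pair.split("=", 2) }.to_h

puts rewritten
puts parsed.inspect
```

One thing to keep in mind: `[^,]+` needs at least one character, so a pair whose value is empty (two commas in a row) is not rewritten by this pattern.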

The only problem left to solve is to translate the keys from numbers to their descriptions... i.e., key 4 is NAS-IP-Address, and I'd much rather have "NAS-IP-Address=ip" than "4=ip", right?
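
One way this renaming could be sketched in plain Ruby is with a lookup hash. The table below is deliberately partial and only lists a few standard RADIUS attribute numbers from RFC 2865; `RADIUS_NAMES` and `rename_keys` are names made up for this example, and keys that are not in the table are assumed to be left as they are.

```ruby
# Partial, illustrative lookup table (standard RADIUS attribute numbers, RFC 2865).
# A real table would need every attribute that appears in the logs.
RADIUS_NAMES = {
  "1" => "User-Name",
  "4" => "NAS-IP-Address",
  "5" => "NAS-Port",
  "6" => "Service-Type"
}.freeze

# rename_keys is a hypothetical helper: it swaps known numeric keys for
# their names and keeps unknown keys unchanged.
def rename_keys(pairs, names = RADIUS_NAMES)
  pairs.each_with_object({}) do |(key, value), out|
    out[names.fetch(key, key)] = value
  end
end

renamed = rename_keys({ "4" => "10.0.0.1", "6" => "2", "99" => "x" })
puts renamed.inspect
```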

Final configuration file! Thanks @Badger and @Christian_Dahlqvist
