I'm trying to use Logstash to ingest some old, OLD IAS/RRAS logs that seem to be customized. They all begin the same, so I can get dissect to ingest this correctly:
And I've read docs and found how I can use %{?key},%{&key} to consume a key=value pair that is split like that... but considering the logs seem to be customized, they may not always have the same number of key=value fields.
I haven't found anything that says I CAN do something for this, like mixing in regex to make a %{?key},%{&key}+, which would indicate the pair can occur more than once. I did see the "Dissect does not use regular expressions" warning in the docs, though. Is there any way to achieve this?
((edit)) still searching, I found this answer... makes me wonder if this would be the way to go...
Dissect is designed to be fast, and that is achieved by having it support a reasonably small number of rules that allow a single pass of the data. I therefore do not think what you are asking for is possible. I would recommend parsing as much as possible using dissect and then using some other filter for the rest.
I'm far away from work now, but since it's still in my head... Yeah, I'm figuring out that maybe I should dissect only the first few fields and treat the keys and values in some other way. Maybe try my hand at Ruby, doesn't seem too hard, even for a non-programmer like me!
I did, but haven't found anything in the documentation that would indicate I can use it. The file format is only commas, there is no other separator character. So I have things like key1,value1,key2,value2,key3,value3
Yeah, and as I said in the initial post, there can be a variable number of kv pairs... Guess I should go for a Ruby solution. Once I get it working I'll post my solution, who knows if anyone else out there will ever need it.
@Badger Looks like I messed up something, forgot about servername value... Sorry about that! Code works =D now to remove that pesky \r at the end of values...
dissect {
  mapping => {
    "message" => "%{NASIPAddress},%{UserName},%{log_timestamp},%{+log_timestamp},%{ServiceType},%{ServerName},%{values}"
  }
}
ruby {
  code => '
    a = event.get("values").split(",")
    h = {}
    while a.length > 1
      h = h.merge( [a.shift(2)].to_h )
    end
    event.set("keyvalues", h)
  '
}
mutate {
  remove_field => ["values"]
}
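For anyone who wants to try the pairing logic outside Logstash, here is a minimal standalone sketch of the same idea. The method name pairs_to_hash is made up for illustration, and it assumes the stray \r only shows up at the very end of the line, which is what chomp removes. Like the loop above, it drops a trailing key that has no value.

```ruby
# Hypothetical helper (not part of Logstash): turns "k1,v1,k2,v2,..." into a Hash.
def pairs_to_hash(values)
  return {} if values.nil?            # the dissect step may not have produced the field
  fields = values.chomp.split(",")    # chomp strips a trailing \r\n, \n or \r
  # Group the fields two at a time and keep only complete key/value pairs.
  fields.each_slice(2).select { |pair| pair.length == 2 }.to_h
end

puts pairs_to_hash("4,10.0.0.1,6,2\r").inspect
```

Inside the ruby filter the same thing should work as event.set("keyvalues", pairs_to_hash(event.get("values"))) if the method body is inlined into the code string, but I have only reasoned about it as plain Ruby.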
((edit)) And now I came up with a solution that doesn't use ruby!
Only problem left to solve is translate the keys from numbers to their descriptions... i.e: key 4 is NAS-IP-Address, and I'd much rather have "NAS-IP-Address=ip" than "4=ip", right?
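One way the key translation could be sketched in plain Ruby is a lookup table applied to the keyvalues hash. Only the 4 => NAS-IP-Address entry comes from this thread; the other entries assume the log uses the standard RADIUS attribute numbers, and both ATTRIBUTE_NAMES and name_keys are made-up names for illustration. The table is deliberately incomplete, so unknown keys are kept as they are.

```ruby
# Illustrative, partial lookup table. Only "4" is confirmed by the logs in this
# thread; the rest assume standard RADIUS attribute numbering.
ATTRIBUTE_NAMES = {
  "1" => "User-Name",
  "4" => "NAS-IP-Address",
  "5" => "NAS-Port",
  "6" => "Service-Type"
}.freeze

# Hypothetical helper: returns a new Hash whose numeric keys are replaced by
# their descriptions; keys missing from the table are left untouched.
def name_keys(keyvalues, names = ATTRIBUTE_NAMES)
  keyvalues.map { |key, value| [names.fetch(key, key), value] }.to_h
end

puts name_keys({ "4" => "10.0.0.1", "9999" => "x" }).inspect
```

Keeping unknown keys unchanged means a vendor-specific or customized attribute still ends up in the event instead of being silently lost.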