Extract information from field

Hello,
I am using Logstash to parse an XML file that contains elements like this:

<Value Obj="SPM=med48610,RGN=region1,AZ=zone1,VCM=med-4861-0-storage-vm0,Link=eth1">

I am looking for a way to get one new field with the SPM value (the username), another with the number of my VM (vm0), and a third with the type (storage). The last two come from VCM=med-4861-0-storage-vm0.
For the moment I can only get a single field that contains the whole Obj attribute.

Thanks for your help.

Use a kv filter to parse the field containing the "SPM=med48610,RGN=region1,AZ=zone1,VCM=med-4861-0-storage-vm0,Link=eth1" string.

I have no idea how to use the kv filter. Which options should I use? Thanks.
I think it's something like this:

kv { field_split => "=" }

But how can I indicate that I only want to split the "SPM=med48610,RGN=region1,AZ=zone1,VCM=med-4861-0-storage-vm0,Link=eth1" string? My XML contains a lot of other elements.

Set the kv filter's source option to the name of the field containing the string. This probably works:

kv {
  source => "some-fieldname"
  field_split => ","
}

Thank you Magnus, but I still have a problem: how can I get a field that contains only vm0 (as in my example, but it could be vm1, ...) and another field with storage (the elements can differ depending on my Obj)?
I would like to use these fields to filter my data in Kibana by adding a dropdown, so the user can choose just the VM number and the type (storage or something else).
Thank you again. I look forward to your suggestions.

In fact, using your solution I only get a field named VCM, and unfortunately this key is sometimes VFM or another name. I thought there might be a way to look for the string vm in "SPM=med48610,RGN=region1,AZ=zone1,VCM=med-4861-0-storage-vm0,Link=eth1" and then store only vm0 (the part before the ','). For the type, I know there are only 3 types, so I can use a condition (if) to store the appropriate one.

Try the grok filter:

grok {
  match => { "message" => '<%{DATA:Info}"SPM=%{DATA:SPM},RGN=%{DATA:RGN},AZ=%{DATA:AZ},VCM=%{DATA:VCM}-%{DATA:VCM}-%{DATA:VCM}-%{DATA:VCM}-%{DATA:VM},Link=%{GREEDYDATA:Link}">' }
}

The output will look like this:

{
  "Info": [["Value Obj="]],
  "SPM": [["med48610"]],
  "RGN": [["region1"]],
  "AZ": [["zone1"]],
  "VCM": [["med", "4861", "0", "storage"]],
  "VM": [["vm0"]],
  "Link": [["eth1"]]
}

Thank you. Is "message" in your example the field name?

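"message" is the name of the field that the grok filter reads from; by default Logstash stores the raw input line there. The new fields are defined inside the pattern that is matched against it.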

grok {
  match => { "message" => ' Here we enter the parsing pattern and fields ' }
}

The field names are Info, SPM, RGN, AZ, VCM, VM, and Link.

The SPM field's value is med48610 and the VM field's value is vm0.

match => { "message" => '<%{DATA:Info}"SPM=%{DATA:SPM},RGN=%{DATA:RGN},AZ=%{DATA:AZ},VCM=%{DATA:VCM}-%{DATA:VCM}-%{DATA:VCM}-%{DATA:VCM}-%{DATA:VM},Link=%{GREEDYDATA:Link}">'}

This grok expression is extremely inefficient. All occurrences of DATA and GREEDYDATA should be replaced with more exact patterns.
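
As a rough sketch of what "more exact" could look like, the expression below swaps each DATA/GREEDYDATA for a character class that stops at the next delimiter. It assumes the raw line is in the message field and that no value contains a comma or a double quote; neither assumption is confirmed by the sample above.

```
filter {
  grok {
    # Each [^,"]+ can only run up to the next comma or quote,
    # so the regex engine has very little room to backtrack.
    # "message" as the source field is an assumption.
    match => {
      "message" => '<Value Obj="SPM=(?<SPM>[^,"]+),RGN=(?<RGN>[^,"]+),AZ=(?<AZ>[^,"]+),VCM=(?<VCM>[^,"]+),Link=(?<Link>[^,"]+)">'
    }
  }
}
```

Unlike the original expression, this keeps the whole VCM value in one field instead of splitting it at the hyphens.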

And I still think the kv-based solution is better, and the issue with the VCM field can be solved with an additional grok or dissect filter that only looks at the VCM value.
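
Something along these lines might do it. This is only a sketch: Obj is an assumed name for the field that holds the attribute string, and vm_type and vm_name are made-up names for the new fields.

```
filter {
  # Split "SPM=...,RGN=...,VCM=..." into one field per key.
  kv {
    source => "Obj"        # assumed field name, adjust to your event
    field_split => ","
    value_split => "="
  }

  # Look only at the VCM value, e.g. "med-4861-0-storage-vm0",
  # and capture the last two hyphen-separated parts.
  grok {
    match => { "VCM" => '-(?<vm_type>[^-]+)-(?<vm_name>vm[0-9]+)$' }
  }
}
```

With the sample value, vm_type should end up as storage and vm_name as vm0. If the value always has exactly five hyphen-separated parts, a dissect filter would be a cheaper alternative to the second grok.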

Yes, kv is working very well, but I still have a problem with VCM: as I mentioned, this name can differ from one XML file to another, so it's hard to handle each file separately. That's why I thought that adding an if could be a solution, but I am not sure it is the best way.

Using conditionals to run different filters depending on which fields are present sounds like an okay idea as long as the number of possible field names is reasonably small.
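
A minimal sketch of that idea, assuming the kv filter has already run and that VCM and VFM are the only two key names to handle (add one branch per additional name). The vm_type and vm_name field names are hypothetical.

```
filter {
  # Run the grok against whichever of the known fields is present.
  if [VCM] {
    grok {
      match => { "VCM" => '-(?<vm_type>[^-]+)-(?<vm_name>vm[0-9]+)$' }
    }
  } else if [VFM] {
    grok {
      match => { "VFM" => '-(?<vm_type>[^-]+)-(?<vm_name>vm[0-9]+)$' }
    }
  }
}
```

Both branches write to the same two fields, so a Kibana dropdown can filter on vm_type and vm_name regardless of which key the XML file used.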
