Extract information from field

saisimo02 · August 21, 2018, 11:25am

Hello,
I am using Logstash to parse an XML file, and I have something like this:

<Value Obj="SPM=med48610,RGN=region1,AZ=zone1,VCM=med-4861-0-storage-vm0,Link=eth1">

I am looking for a way to get a new field holding the SPM value, which contains the username, plus one field for the VM number (vm0) and another for the type (storage), both taken from VCM=med-4861-0-storage-vm0.
For the moment I can only get a single field that contains the whole Obj attribute.

Thanks for your help.

magnusbaeck · August 21, 2018, 11:30am

Use a kv filter to parse the field containing the "SPM=med48610,RGN=region1,AZ=zone1,VCM=med-4861-0-storage-vm0,Link=eth1" string.

saisimo02 · August 21, 2018, 11:37am

I have no idea how to use the kv filter. Which option should I use? Thanks.
I think it's something like this:

kv { field_split => "=" }

But how can I indicate that the string I am trying to split is "SPM=med48610,RGN=region1,AZ=zone1,VCM=med-4861-0-storage-vm0,Link=eth1"?

I ask because my XML contains a lot of other elements.

magnusbaeck · August 21, 2018, 12:32pm

Set the kv filter's source option to the name of the field containing the string. This probably works:

kv {
  source => "some-fieldname"
  field_split => ","
}

saisimo02 · August 21, 2018, 3:08pm

Thank you Magnus, but I still have a problem: how can I get a field that contains only vm0, as in my example (it could also be vm1, and so on), and another field containing storage? The elements can differ depending on my Obj.
I would like to use these fields to filter my data in Kibana by adding a drop-down, so the user can choose only the VM number and the type (storage or something else).
Thank you again. I look forward to your suggestions.

saisimo02 · August 21, 2018, 3:13pm

In fact, using your solution I only get a field named VCM, and unfortunately this key is sometimes VFM or another name. I thought there might be a way to look for the string vm in "SPM=med48610,RGN=region1,AZ=zone1,VCM=med-4861-0-storage-vm0,Link=eth1" and then add only vm0 (the part before the ','). For the type, I know there are only 3 types, so with a condition (if) I can store the appropriate one.

rijinmp · August 21, 2018, 3:38pm

Try the grok filter:

grok {
  match => { "message" => '<%{DATA:Info}"SPM=%{DATA:SPM},RGN=%{DATA:RGN},AZ=%{DATA:AZ},VCM=%{DATA:VCM}-%{DATA:VCM}-%{DATA:VCM}-%{DATA:VCM}-%{DATA:VM},Link=%{GREEDYDATA:Link}">'}
}

rijinmp · August 21, 2018, 3:40pm

The output will look like this:

{
  "Info": [
    [
      "Value Obj="
    ]
  ],
  "SPM": [
    [
      "med48610"
    ]
  ],
  "RGN": [
    [
      "region1"
    ]
  ],
  "AZ": [
    [
      "zone1"
    ]
  ],
  "VCM": [
    [
      "med",
      "4861",
      "0",
      "storage"
    ]
  ],
  "VM": [
    [
      "vm0"
    ]
  ],
  "Link": [
    [
      "eth1"
    ]
  ]
}

saisimo02 · August 21, 2018, 3:50pm

Thank you. Is "message" in your example the field name?

rijinmp · August 21, 2018, 3:54pm

Not the name of a new field. "message" is the field that the grok filter reads from; by default it holds the raw text of the event. The new fields are named inside the pattern.

grok {
  match => { "message" => ' Here we enter the parsing pattern and fields '}
}
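
As a hedged illustration of the match option: the hash key names the event field that grok reads, and the value is the pattern applied to it. The sketch below assumes the attribute string has already been extracted into a field called Obj; that field name is hypothetical and depends on how the XML is parsed.

```
grok {
  # "Obj" is an assumed field name; replace it with the field that
  # actually holds the "SPM=...,RGN=...,..." string in your events.
  # The named capture stores everything after "SPM=" up to the next comma.
  match => { "Obj" => "SPM=(?<SPM>[^,]+)" }
}
```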

rijinmp · August 21, 2018, 3:58pm

The field names are:

Info, SPM, RGN, AZ, VCM, VM, Link

The SPM field's value is med48610 and the VM field's value is vm0.

magnusbaeck · August 21, 2018, 4:46pm

match => { "message" => '<%{DATA:Info}"SPM=%{DATA:SPM},RGN=%{DATA:RGN},AZ=%{DATA:AZ},VCM=%{DATA:VCM}-%{DATA:VCM}-%{DATA:VCM}-%{DATA:VCM}-%{DATA:VM},Link=%{GREEDYDATA:Link}">'}

This grok expression is extremely inefficient. All occurrences of DATA and GREEDYDATA should be replaced with more exact patterns.

And I still think the kv-based solution is better, and the issue with the VCM field can be solved with an additional grok or dissect filter that only looks at the VCM value.
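
A minimal sketch of that combination, under stated assumptions: the source field name Obj and the new field names vm_type and vm_name are hypothetical, and the grok pattern assumes the VCM value always ends in "-<type>-vm<number>".

```
filter {
  # Split "SPM=...,RGN=...,AZ=...,VCM=...,Link=..." into one field per key.
  kv {
    source      => "Obj"   # assumed name of the field holding the string
    field_split => ","
    value_split => "="
  }

  # Look only at the VCM value and pick out its last two dash-separated parts.
  grok {
    match => { "VCM" => "-(?<vm_type>[^-]+)-(?<vm_name>vm\d+)$" }
  }
}
```

For the sample value med-4861-0-storage-vm0 this should give vm_type set to storage and vm_name set to vm0. Events whose VCM value does not fit the pattern are tagged _grokparsefailure by default.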

saisimo02 · August 21, 2018, 6:18pm

Yes, kv is working very well, but I still have a problem with VCM. As I mentioned, this name can differ from one XML file to another, so it's hard to handle it for each file. That's why I thought adding an if could be a solution, but I'm not sure it's the best way.

magnusbaeck · August 21, 2018, 6:52pm

Using conditionals to run different filters depending on which fields are present sounds like an okay idea as long as the number of possible field names is reasonably small.
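
A rough sketch of what such conditionals could look like, placed after the kv filter. Only VCM and VFM are named in this thread; the target field names vm_type and vm_name and the pattern itself are assumptions for illustration.

```
filter {
  # Run the grok filter against whichever key the kv filter produced.
  if [VCM] {
    grok {
      match => { "VCM" => "-(?<vm_type>[^-]+)-(?<vm_name>vm\d+)$" }
    }
  } else if [VFM] {
    grok {
      match => { "VFM" => "-(?<vm_type>[^-]+)-(?<vm_name>vm\d+)$" }
    }
  }
}
```

Each additional key name needs its own branch, which is why this approach only stays manageable for a small set of names.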

system · September 18, 2018, 6:52pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
