XML format on rows


(Micke) #1

Hi,
I have a Microsoft NPS server that outputs monthly log files with XML-formatted rows that look like this:

<Event><Timestamp data_type="4">10/17/2017 15:57:24.465</Timestamp><Computer-Name data_type="1">NPSSERVER01</Computer-Name><Event-Source data_type="1">IAS</Event-Source><Class data_type="1">311 1 192.168.1.12 05/13/2017 20:05:10 45069</Class><MS-Extended-Quarantine-State data_type="0">0</MS-Extended-Quarantine-State><MS-Quarantine-State data_type="0">0</MS-Quarantine-State><Client-IP-Address data_type="3">192.168.1.10</Client-IP-Address><Client-Vendor data_type="0">0</Client-Vendor><Client-Friendly-Name data_type="1">firewall.domain.com</Client-Friendly-Name><Proxy-Policy-Name data_type="1">PolicyName</Proxy-Policy-Name><Provider-Type data_type="0">1</Provider-Type><SAM-Account-Name data_type="1">DOMAIN\user1</SAM-Account-Name><Authentication-Type data_type="0">1</Authentication-Type><NP-Policy-Name data_type="1">FirewallAdmins</NP-Policy-Name><Quarantine-Update-Non-Compliant data_type="0">1</Quarantine-Update-Non-Compliant><Vendor-Specific data_type="2">220038A5010F5344697411682111646D696E73</Vendor-Specific><Vendor-Specific data_type="2">330038A501174163635F4E11135F426C7565436F61745F46756C6C</Vendor-Specific><Framed-Protocol data_type="0">1</Framed-Protocol><Service-Type data_type="0">2</Service-Type><MS-Link-Utilization-Threshold data_type="0">50</MS-Link-Utilization-Threshold><MS-Link-Drop-Time-Limit data_type="0">120</MS-Link-Drop-Time-Limit><Fully-Qualifed-User-Name data_type="1">domain.com/OU/IT/User1</Fully-Qualifed-User-Name><Packet-Type data_type="0">2</Packet-Type><Reason-Code data_type="0">0</Reason-Code></Event>

How can I format this so that ES sees it correctly?
I use Beats agents / Logstash / ES.


#2
filter {
  xml {
    source => "message"
    target => "theXML"
  }
}

Would be a start. What do you mean by "correctly"? :smiley:


(Micke) #3

By "correctly" I mean not as a single chunk of data in the message field.
If I use the xml filter, will the data be structured into fields automatically?

If so, can I refer to the XML data fields and manage them later in the filter section? For example, sync the Timestamp field with the ES @timestamp.


#4

Sure. I suggest you start off something like this and drop single lines into it to see what the structure looks like.

 input { stdin {} }
 output { stdout { codec => rubydebug } }
 
 filter {
   xml {
     source => "message"
     target => "theXML"
   }
 }

To use the timestamp as @timestamp you would use something like this. Note that everything is an array.

   date {
     match => [ "[theXML][Timestamp][0][content]", "MM/dd/YYYY HH:mm:ss.SSS" ]
     timezone => "Europe/Gibraltar"
   }

(Micke) #5

Hi,
The XML data import worked fine, but...
Because the XML looks like this:
<Timestamp data_type="4">10/18/2017 14:33:34.139</Timestamp>
<Computer-Name data_type="1">SERVER1</Computer-Name>
<Event-Source data_type="1">IAS</Event-Source>

The fields look like this:
Name: theXML.Timestamp
Value:
{
"data_type": "4",
"content": "10/18/2017 14:33:34.139"
}

Name: theXML.Computer-Name
Value:
{
"data_type": "1",
"content": "SERVER1"
}

Name: theXML.Event-Source
Value:
{
"data_type": "1",
"content": "IAS"
}

I am only interested in the content.
So I want theXML.Event-Source: IAS

Can I do that dynamically, without explicitly pointing to each content sub-value?

Thanks


(Micke) #6

If I add:
force_array => false

Then all fields are split up like this:
theXML.Timestamp.content: 10/18/2017 14:33:34.139
theXML.Timestamp.data_type: 4

theXML.Computer-Name.content: SERVER1
theXML.Computer-Name.data_type: 1
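
With force_array => false the array index drops out of the field references, so the date filter from the earlier reply would need to point at [theXML][Timestamp][content] instead of [theXML][Timestamp][0][content]. A minimal sketch, assuming the same target name and the timezone used in that reply (adjust it to wherever the NPS server actually runs). As far as I know, elements that repeat within one event, such as Vendor-Specific, still come out as arrays.

```
filter {
  xml {
    source => "message"
    target => "theXML"
    # Single elements become plain hashes instead of one-element arrays
    force_array => false
  }
  date {
    # No [0] index here because force_array is off
    match => [ "[theXML][Timestamp][content]", "MM/dd/YYYY HH:mm:ss.SSS" ]
    # Timezone copied from the earlier example; an assumption, change as needed
    timezone => "Europe/Gibraltar"
  }
}
```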

Do I need to use XPath to extract the content data into, for example, theXML.Computer-Name?

Thanks


#7

You could do it with XPath, or you could do stuff like mutate { replace => { "EventSource" => "%{[theXML][Event-Source][0][content]}" } } (field references are case-sensitive, so the name has to match the target "theXML" exactly).
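
For the XPath route, a rough sketch using the xpath option of the xml filter. The destination field names on the right are my own choice, not anything required by the plugin. Each expression still has to be listed explicitly, and the values land in the destination fields as arrays, so this does not solve the "dynamic" part of the question.

```
filter {
  xml {
    source => "message"
    # Skip building the full theXML structure; only the XPath results are kept
    store_xml => false
    xpath => {
      # text() selects the element content and ignores the data_type attribute
      "/Event/Timestamp/text()"     => "Timestamp"
      "/Event/Computer-Name/text()" => "Computer-Name"
      "/Event/Event-Source/text()"  => "Event-Source"
    }
  }
}
```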

I do not know if there is a way to globally replace [X] with [X][0][content] without knowing each individual value of X. Maybe a ruby function, but I don't know enough to write it.


(Micke) #8

Yeah, I saw that others have used replace and other methods. The thing is that the fields can vary from event to event. Sure, I could go through and do a replace for every field that I know of, but it would be nice if I could grab the content from all fields right away. Maybe that can't be done with the target configuration option alone.

Maybe not the most beautiful way, but I could use an if statement that checks whether the field name contains data_type and, if so, drops it.
And then do some kind of replace on the .content fields to remove "content" from the name?
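
One way the idea above might be done without listing every field is a ruby filter that walks whatever is under theXML and keeps only the content values. This is an untested sketch, assuming Logstash 5.0 or later (for the event.get / event.set API) and the target name theXML used earlier in the thread. It tolerates both force_array settings, and leaves repeated elements such as Vendor-Specific as an array of their content values.

```
filter {
  xml {
    source => "message"
    target => "theXML"
  }
  ruby {
    code => '
      parsed = event.get("theXML")
      if parsed.is_a?(Hash)
        flat = {}
        parsed.each do |name, value|
          # Normalise to an array so both force_array settings are handled
          items = value.is_a?(Array) ? value : [value]
          # Keep only the element text and discard data_type
          contents = items.map do |item|
            (item.is_a?(Hash) && item.key?("content")) ? item["content"] : item
          end
          # Single elements become plain values, repeated ones stay arrays
          flat[name] = contents.length == 1 ? contents.first : contents
        end
        # Replace the nested structure with the flattened one
        event.set("theXML", flat)
      end
    '
  }
}
```

After this, a field such as [theXML][Event-Source] should hold the string IAS directly, and a date filter would match on [theXML][Timestamp] rather than on the nested content path.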

