I am using Logstash 7.6.1 to ingest an XML file that was generated by a standard Microsoft Windows tool called WinInet Trace. I am able to ingest the entire XML, but I also want to generate some fields (hashes) from values in the XML tree, and I've been trying to use xpath for that.
The data file looks like this:
<Events>
  <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
    <System>
      <Provider Guid="{9e814aad-3204-11d2-9a82-006008a86939}" />
      <EventID>0</EventID>
      <Version>2</Version>
      <Level>0</Level>
      <Task>0</Task>
      <Opcode>0</Opcode>
      <Keywords>0x0</Keywords>
      <TimeCreated SystemTime="2020-03-27T14:00:48.756779200+00:00" />
      <Correlation ActivityID="{00000000-0000-0000-0000-000000000000}" />
      <Execution ProcessID="7500" ThreadID="7464" ProcessorID="0" KernelTime="90" UserTime="30" />
      <Channel />
      <Computer />
    </System>
    <EventData>
      <Data Name="BufferSize"> 8192</Data>
      <Data Name="Version">83951878</Data>
      <Data Name="ProviderVersion"> 7601</Data>
      <Data Name="NumberOfProcessors"> 2</Data>
      <Data Name="EndTime">132298058270657554</Data>
      <Data Name="TimerResolution"> 156001</Data>
      <Data Name="MaxFileSize"> 0</Data>
      <Data Name="LogFileMode">0x0</Data>
      <Data Name="BuffersWritten"> 17696</Data>
      <Data Name="StartBuffers"> 1</Data>
      <Data Name="PointerSize"> 8</Data>
      <Data Name="EventsLost"> 1</Data>
      <Data Name="CPUSpeed"> 2400</Data>
      <Data Name="LoggerName">0x5</Data>
      <Data Name="LogFileName">0x7</Data>
      <Data Name="BootTime">132297485993751998</Data>
      <Data Name="PerfFreq">10000000</Data>
      <Data Name="StartTime">132297912487567792</Data>
      <Data Name="ReservedFlags">0x1</Data>
      <Data Name="BuffersLost"> 0</Data>
      <Data Name="SessionNameString">wininettrace</Data>
      <Data Name="LogFileNameString">C:\Temp\wininettrace.etl</Data>
    </EventData>
    <RenderingInfo Culture="en-GB">
      <Opcode>Header</Opcode>
      <Provider>MSNT_SystemTrace</Provider>
      <EventName xmlns="http://schemas.microsoft.com/win/2004/08/events/trace">EventTrace</EventName>
    </RenderingInfo>
    <ExtendedTracingInfo xmlns="http://schemas.microsoft.com/win/2004/08/events/trace">
      <EventGuid>{68fdd900-4a3e-11d1-84f4-0000f80464e3}</EventGuid>
    </ExtendedTracingInfo>
  </Event>
The configuration file is this:
input {
  file {
    path => "c:/traces/pcbd/20200327/sample.xml"
    start_position => "beginning"
    sincedb_path => "NUL"
    type => "xml"
    codec => multiline {
      pattern => "<Event "
      negate => true
      what => "previous"
      auto_flush_interval => 1
    }
  }
}
filter {
  xml {
    source => "message"
    target => "wininet"
    store_xml => true
    xpath => [ "//Event/System/EventID/text()", "System.EventID" ]
  }
}
filter {
  mutate { remove_field => [ "message" ] }
}
output {
  elasticsearch {
    hosts => "localhost"
    index => "pcbd"
  }
  stdout {
    codec => rubydebug
  }
}
The records get ingested with all of the XML content in the JSON (under the wininet target), but the System.EventID field is never generated.
I spent ages looking for the answer to this, so I'll post it next.
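For anyone hitting the same symptom, one likely cause (an assumption on my part, not a confirmed diagnosis) is the default namespace declared on the Event element (xmlns="http://schemas.microsoft.com/win/2004/08/events/event"). With a default namespace in scope, an unprefixed XPath such as //Event/System/EventID selects nothing. A minimal sketch of the xml filter using the plugin's remove_namespaces option, with everything else left as in the configuration above:

```
filter {
  xml {
    source => "message"
    target => "wininet"
    store_xml => true
    # Assumption: the default xmlns on <Event> is what stops the XPath from matching.
    # remove_namespaces strips namespace information from the document
    # before the XPath expressions are evaluated.
    remove_namespaces => true
    xpath => [ "//Event/System/EventID/text()", "System.EventID" ]
  }
}
```

If the namespaces need to be preserved, the filter also has a namespaces option that maps a prefix to a namespace URI, and the XPath can then use that prefix on each element name.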