I'm using the xml filter plugin in Logstash to parse an XML document, and it works except for the @timestamp value, which comes from an attribute value. Here is a snippet of the XML file:
<cdf:Benchmark resolved="1" style="SCAP_1.2">
<cdf:TestResult start-time="2022-01-17T11:15:01" end-time="2022-01-17T11:15:37">
and here is my full pipeline config (the last line of the filter is the problem):
input {
  file {
    path => [ "C:/temp/SCAP/*.xml" ]
    start_position => "beginning"
    codec => multiline {
      pattern => "^ZsExDrC"
      what => "previous"
      negate => true
      auto_flush_interval => 2
      max_lines => 50000
    }
  }
}
filter {
  xml {
    source => "message"
    target => "doc"
    xpath => [ "/cdf:Benchmark/cdf:title/text()", "benchmark",
               "/cdf:Benchmark/cdf:plain-text[@id='release-info']/text()", "release-info",
               "/cdf:Benchmark/cdf:Value[1]/cdf:title/text()", "setting-title",
               "/cdf:Benchmark/cdf:TestResult/cdf:score[1]/text()", "vulnerability.score.base",
               "/cdf:Benchmark/cdf:TestResult/cdf:target/text()", "host.name",
               "/cdf:Benchmark/cdf:TestResult/cdf:target-facts/cdf:fact[@name='urn:scap:fact:asset:identifier:os_name']/text()", "host.os.name",
               "/cdf:Benchmark/cdf:TestResult/cdf:target-address[normalize-space()][1]/text()", "host.ip",
               "/cdf:Benchmark/cdf:TestResult/@start-time", "@timestamp"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "scap-results-%{+YYYY.MM.dd}"
  }
}
The xpath is working correctly, but it doesn't identify that element attribute as a date field because I get the following output from the xpath: start-time=2022-01-17T11:15:01 instead of 2022-01-17T11:15:01. In other words, I don't seem to be able to select just the attribute value without the attribute name.
I tried adding /text() to the end like this:
/cdf:Benchmark/cdf:TestResult/@start-time/text()
but then the xpath fails because this isn't proper xpath syntax.
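
One direction that may be worth trying, offered as an untested sketch rather than a confirmed fix: XPath 1.0 defines a string() function that returns the string value of a node, so wrapping the attribute path in it should yield the value without the attribute name. The fragment below assumes the xml filter accepts a string-valued XPath result, writes it to a temporary field (start_time is a hypothetical name, not part of the original config), and assumes that xpath results are stored as an array, hence the [0] index. A date filter then parses that value into @timestamp.

```
filter {
  xml {
    source => "message"
    target => "doc"
    # string() returns the attribute's value instead of the attribute node itself
    xpath => [ "string(/cdf:Benchmark/cdf:TestResult/@start-time)", "start_time" ]
  }
  date {
    # "start_time" is a hypothetical temporary field;
    # [0] assumes the xml filter stores xpath results as an array
    match => [ "[start_time][0]", "ISO8601" ]
    target => "@timestamp"
  }
}
```

If the xpath result turns out to be a plain string rather than an array, the match source would presumably be just "start_time".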