I am trying to parse a JUnit XML file, and I always get the _xmlparsefailure tag with the following reason:
exception=>#<REXML::ParseException: Declarations can only occur in the doctype declaration.
The configuration below works for other JUnit XML files, but certain files fail to parse.
# logstash.conf
input {
  beats {
    port => 5044
  }
}
filter {
  xml {
    source => "message"
    store_xml => true
    force_content => true
    target => "parsed"
  }
}
output {
  stdout { codec => rubydebug }
}
A sample JUnit XML file that fails is available here: https://pastebin.com/CdbVtZ0n
Note: when I remove the data inside <system-out> before sending the file, parsing works as it should. I suspect the XML filter is trying to parse the data inside the CDATA section, which it should not.
I also tried removing the <system-out> element before parsing, using the gsub option of the mutate filter as shown below. However, the replacement only takes effect for XML that already parses successfully. For the XML above, the parse still fails.
mutate {
  gsub => [
    "message", "<system-out>[\s\S]*?<\/system-out>", "<replaced-sys-ot>M</replaced-sys-ot>"
  ]
}
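One thing that may matter here is filter order: Logstash runs filters in the order they appear in the config, so the gsub can only affect parsing if the mutate block comes before the xml block. Below is a minimal sketch of that ordering. It assumes the whole XML document arrives as a single event in the message field, and I have not confirmed that this resolves the failure for the sample file.

```
filter {
  # Assumption: "message" holds the complete XML document as one event.
  # Strip the <system-out> element first, so the xml filter never sees its CDATA payload.
  mutate {
    gsub => [
      "message", "<system-out>[\s\S]*?<\/system-out>", "<replaced-sys-ot>M</replaced-sys-ot>"
    ]
  }
  # Parse only after the replacement has been applied.
  xml {
    source => "message"
    store_xml => true
    force_content => true
    target => "parsed"
  }
}
```

If the parse still fails with this ordering, the gsub pattern is probably not matching the event as it arrives, for example because the closing </system-out> tag is in a different event.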
Any help is greatly appreciated! Thank you