Parse Error while parsing JUnit XML

I am trying to parse a JUnit XML file, and I always get the _xmlparsefailure tag, with the reason given as:
exception=>#<REXML::ParseException: Declarations can only occur in the doctype declaration.

The configuration below works for other JUnit XML files; however, certain XML files fail to parse.

# logstash.conf

input {
  beats {
    port => 5044
  }
}

filter {
  xml {
    source => "message"
    store_xml => true
    force_content => true
    target => "parsed"
  }
}
  
output {
  stdout { codec => rubydebug }
}

A sample JUnit XML file that fails is available here: https://pastebin.com/CdbVtZ0n
Note: I have noticed that when I remove the data in <system-out> before sending the file, it parses as expected. I suspect the XML filter is somehow trying to parse the data inside the CDATA section, which it should not.

I also tried removing the system-out element before parsing, using the gsub option of the mutate filter as shown below. However, this only removes the element for XML that already parses successfully. For the XML above, the parse still fails.

  mutate {
    gsub => [
      "message", "<system-out>[\s\S]*?<\/system-out>", "<replaced-sys-ot>M</replaced-sys-ot>"
    ]
  }
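
For reference, here is a sketch of how the two pieces might fit together in one filter section. It assumes the mutate block is declared before the xml filter (Logstash runs filters in the order they are declared) and that the whole XML document arrives in a single event's "message" field. Whether this arrangement avoids the parse failure for the sample file above is not something I have confirmed.

```
filter {
  # Assumption: mutate is declared first, so the substitution runs on the
  # raw "message" before the xml filter attempts any parsing.
  mutate {
    gsub => [
      "message", "<system-out>[\s\S]*?<\/system-out>", "<replaced-sys-ot>M</replaced-sys-ot>"
    ]
  }

  # Same xml filter settings as in logstash.conf above.
  xml {
    source => "message"
    store_xml => true
    force_content => true
    target => "parsed"
  }
}
```

The pattern can only match if both the opening and closing system-out tags are in the same event, so an event that carries only part of the file would be left unchanged.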

Any help is greatly appreciated! Thank you :)