Hi All,
I am trying to migrate the XML logs to JSON with Logstash, but the parsing fails due to some unnecessary XML logs. The <exception> tag is not needed and I attempted to remove it using remove_tag, but it is not working.
I added this to the config:
remove_tag => ["exception"]
<?xml version="1.0" encoding="UTF-8"?>
<logger>
<log>
<server>
<exception name="Address already in use (Bind failed)">
java.net.BindException: Address already in use (Bind failed)
java.net.ServerSocket.<init>(ServerSocket.java:237)
</exception>
</server>
</log>
</logger>
exception:
:exception=>#<REXML::ParseException: Missing end tag for 'init' (got 'exception')
this is the config:
input {
  file {
    mode => "read"
    path => "log11.xml"
    sincedb_path => "nul"
    start_position => "beginning"
    type => "xml"
    codec => multiline {
      pattern => "^logger>"
      negate => true
      what => "previous"
    }
  }
}
filter {
  xml {
    remove_tag => ["exception"]
    source => "message"
    xpath => []
    target => "xml_value"
  }
  mutate {
    remove_field => [tags, host, message, xml_value]
  }
}
output {
  stdout {
    codec => json
  }
}
That init "tag" is never terminated. For this specific case you could use
mutate { gsub => [ "message", "<init>", "&lt;init&gt;" ] }
It's going to be a lot of work to generalize that.
To answer your actual question ... remove_tag will not work. It removes entries from the [tags] field (an array) if the filter successfully parses the XML. However, it will never be applied if there is an XML parse error, and even if it were, it would not remove the <exception> element.
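To illustrate what remove_tag is actually for, here is a minimal sketch. The tag name "raw_xml" is hypothetical and assumes something earlier in the pipeline (for example a tags option on the input) added it to the event.

```
filter {
  xml {
    source => "message"
    target => "xml_value"
    # Only applied when the XML parses successfully.
    # Removes the string "raw_xml" (hypothetical tag name) from the
    # event's [tags] array; it does not touch any XML element.
    remove_tag => ["raw_xml"]
  }
}
```

On a parse failure the event keeps "raw_xml" in [tags], so the option has no effect on the XML content either way.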
You can fix this using
mutate { gsub => [ "message", "
", "" ] }
mutate { gsub => [ "message", "<exception .*</exception>", "" ] }
Yes, that is a literal newline embedded inside the string that the first mutate is trying to match.
Note that once you remove the <exception> element there is nothing left in the <logger> element so you will end up with
"xml_value" => nil,
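As an alternative sketch, the two substitutions can probably be collapsed into one by letting the pattern itself span lines. This assumes the gsub pattern is compiled as a Ruby regular expression, where the inline (?m) flag lets "." match newlines. The lazy .*? is an addition here, so that each <exception> element is removed separately if the message contains more than one.

```
filter {
  mutate {
    # (?m) : "." also matches newlines, so no literal newline is needed.
    # .*?  : lazy match, stops at the first closing </exception> tag.
    gsub => [ "message", "(?m)<exception .*?</exception>", "" ]
  }
}
```

Like any cleanup of the raw text, this has to run before the xml filter sees the message.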
Thanks @Badger for your input.
<exception> is just one of the tags in my XML log file.
I tried replacing <init> with an empty string, but it is not working.
Here is my updated Logstash configuration:
input {
  file {
    mode => "read"
    path => "log11.xml"
    sincedb_path => "nul"
    start_position => "beginning"
    type => "xml"
    codec => multiline {
      pattern => "^logger>"
      negate => true
      what => "previous"
    }
  }
}
filter {
  xml {
    source => "message"
    xpath => []
    target => "xml_value"
  }
  mutate {
    gsub => ["message", "<init>", ""]
  }
}
output {
  stdout {
    codec => json
  }
}
I also tried both expressions that you mentioned:
mutate { gsub => [ "message", "<init>", "&lt;init&gt;" ] }
mutate { gsub => [ "message", "
", "" ] }
mutate { gsub => [ "message", "<exception .*</exception>", "" ] }
But I still get the same exception:
:exception=>#<REXML::ParseException: Missing end tag for 'init' (got 'exception')
The exception occurs when the XML parser hits the <init> string in the middle of the XML. You need to remove it before trying to parse the message.
Move the mutate+gsub to come before the xml filter.
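A minimal sketch of the reordered filter section, assuming the only offending text is the literal <init> from the Java stack trace. Here it is escaped as XML entities rather than deleted, so the stack trace line stays readable; replacing it with "" would work the same way.

```
filter {
  # Clean the raw text first, while [message] is still an unparsed string.
  mutate {
    gsub => [ "message", "<init>", "&lt;init&gt;" ]
  }
  # The xml filter runs second and now sees well-formed XML.
  xml {
    source => "message"
    target => "xml_value"
  }
}
```

Filters run in the order they appear in the config, which is why the gsub placed after the xml filter never had a chance to help.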