XML filter - Not parsing XML with a version tag


(Tag V) #1

The XML filter is not able to parse a message if it contains an XML declaration (version tag).

Logstash parses the XML perfectly with this message:

<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task"><RegistrationInfo> <Date>2007-12-14T13:46:47.34375</Date> <Author>WIN-R9H529RIO4Y\Administrator</Author> <Description>Kick of night jobs</Description></RegistrationInfo></Task>

But Logstash is not able to parse this XML message:

**<?xml version="1.0" encoding="UTF-16"?>**<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task"><RegistrationInfo> <Date>2007-12-14T13:46:47.34375</Date> <Author>WIN-R9H529RIO4Y\Administrator</Author> <Description>Kick of night jobs</Description></RegistrationInfo></Task>

How can I handle this with the Logstash xml filter?

My Logstash conf:

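    xml {
        # Parse the XML in "message" into a hash stored under @metadata
        remove_namespaces => true
        source => "message"
        target => "[@metadata][xml_content]"
        force_array => false
    }
    ruby {
        # Copy each top-level XML key onto the event under a lowercased name
        code => '
            xml_content = event.get("[@metadata][xml_content]")
            unless xml_content.nil?
                xml_content.each do |key, value|
                    event.set(key.downcase, value)
                end
            end
        '
    }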

Thanks in advance.


(Magnus Bäck) #2

What's the error message?


(Magnus Bäck) #4

I don't believe that the XML filter fails to parse a document without logging an error message about it. Look again. If nothing turns up, produce a complete recipe for reproducing the problem.


(Tag V) #5

Yes, I am getting an error.

Error parsing xml with XmlSimple {:source=>"message", :value=>"<?xml version=\"1.0\" encoding=\"UTF-16\"?><Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task"> 2007-12-14T13:46:47.34375 WIN-R9H529RIO4Y\Administrator Kick of night jobs\r", :exception=>#<Encoding::InvalidByteSequenceError: "">"" on UTF-16>,

Please find the image attached.


(Magnus Bäck) #6

Please don't post screenshots when you can copy/paste the text.

Is that file really UTF-16? If it were, I'd assume you'd have to adjust the Logstash codec's charset option so it treats the file as UTF-16 rather than UTF-8. I suspect the file is actually UTF-8, which would explain why things work without <?xml version="1.0" encoding="UTF-16"?> at the top.
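To make the two possibilities concrete, here is a rough Ruby sketch, offered as an illustration rather than a confirmed fix. The helper names `looks_like_utf16?` and `strip_xml_declaration` are hypothetical and not part of Logstash or the xml filter, and the sample string is a shortened version of the message above. The first helper is only a heuristic for checking whether the bytes are plausibly UTF-16. The second shows one possible workaround, assuming the content really is UTF-8 and only the declaration is wrong: drop the declaration before parsing.

```ruby
# Hypothetical helpers for illustration; neither is part of Logstash.

# Heuristic check: UTF-16 text normally starts with a byte order mark
# (FF FE or FE FF) or, for ASCII-range content, contains NUL bytes.
def looks_like_utf16?(raw)
  bytes = raw.b  # binary copy, so the comparison is done byte by byte
  bytes.start_with?("\xFF\xFE".b, "\xFE\xFF".b) || bytes.include?("\x00".b)
end

# Remove a leading XML declaration such as
# <?xml version="1.0" encoding="UTF-16"?> and leave the rest untouched.
def strip_xml_declaration(text)
  text.sub(/\A\s*<\?xml\b[^>]*\?>\s*/, "")
end

# Shortened sample: the declaration claims UTF-16, but the bytes are UTF-8.
sample = '<?xml version="1.0" encoding="UTF-16"?>' \
         '<Task version="1.2"><Author>Administrator</Author></Task>'

# The same text actually encoded as UTF-16, for comparison.
real_utf16 = sample.encode("UTF-16")

puts looks_like_utf16?(sample)      # false: plain UTF-8 bytes
puts looks_like_utf16?(real_utf16)  # true: starts with a byte order mark
puts strip_xml_declaration(sample)  # the <Task> element without the declaration
```

If the check suggests the data is genuinely UTF-16, adjusting the codec charset as described above is the more direct route. If it is UTF-8, the same pattern could presumably be applied with a mutate gsub on the message field ahead of the xml filter, though that has not been tested against this setup.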

