XML filter help

Hi,

I'm trying to import a log file, which contains an XML message embedded in each line, into Elasticsearch using Logstash. I use the kv filter to extract the XML message.

Then I'm trying to extract several values from this XML message. I have been trying this for hours now, but I think I'm missing something or I haven't understood the XML plugin well.

Therefore I need your help to figure this out.

An example XML message:

<?xml version="1.0" encoding="UTF-8"?>
<msgType1 xmlns="http://www.w1.org/" xmlns:ds="http://www.w3.org" xmlns:trlh="http://www.w2.com" Version="5-8">
    <header>
        <messageId messageIdScheme="http://www.w4.com">12345</messageId>
        <sentBy>w4</sentBy>
        <sendTo>abc</sendTo>
        <creationTimestamp>20160516</creationTimestamp>
    </header>
    <pCId correlationIdScheme="http://www.w4.com">3456</pCId>
    <CId CIdScheme="http://www.w4.com">1233</CId>
    <sequenceNumber>1</sequenceNumber>
    <party id="party1">
        <partyId>test1</partyId>
    </party>
    <party id="party2">
        <partyId>test2</partyId>
    </party>
</msgType1>

My filter:

xml {
    store_xml => false
    source => "src"
    target => "msgType1"
    xpath => [ "/msgType1/sequenceNumber", "sequenceNumber" ]
}

If I use the XML filter above, a field for sequenceNumber is not created.
However, if I comment out the first line (i.e. #store_xml => false), multiple fields are generated, but the sequenceNumber field is still not among them.

Any advice on this is highly appreciated.

Thanks.

Hi,
To understand a bit more, could you share your LS version and the full config you're using?

  • purbon

Hi Purbon,

I'm using logstash-2.0.0.

The config file is:

input {
    file {
        path => "D:\PROJECTS\kibana\log_file"
        start_position => beginning
        sincedb_path => "D:\PROJECTS\kibana\since_db_tmp"
        delimiter => "

"
    }
}

filter {

    kv { field_split => "\u0001" }

    if [message] =~ /^$/ {
        drop {}
    }

    xml {
        store_xml => false
        source => "src"
        target => "msgType1"
        xpath => [ "/msgType1/sequenceNumber", "sequenceNumber" ]
    }
}

output {
    elasticsearch {
        hosts => ["127.0.0.1:9200"]
        action => "index"
        index => "logstash-test"
        template_overwrite => true
    }
    stdout { }
}

Thanks.

One more thing,

The XML message is multi-line, as shown in my first post. It is not received as a single line.

Yeah, this seems like a bug in the xml filter. The destination field is populated if I change the XPath query to "/*" but with "/msgType1" I get nothing.
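
One possible explanation, offered as an assumption since nothing in this thread confirms it: the root element declares a default namespace (xmlns="http://www.w1.org/"), and in XPath 1.0 an unprefixed name such as msgType1 only matches elements that are in no namespace. That would be consistent with "/*" matching while "/msgType1" returns nothing. Below is a minimal sketch of a namespace-agnostic query. It reuses the src field and the sequenceNumber destination from the config above, relies only on the standard XPath functions local-name() and text(), and has not been tested against logstash-2.0.0.

```
filter {
    xml {
        # Only run the XPath queries; do not store the whole parsed document.
        store_xml => false
        # "src" is the field holding the XML, as in the original config.
        source => "src"
        # local-name() compares element names while ignoring their namespace,
        # so the default xmlns on <msgType1> no longer prevents a match.
        # text() selects the element's text content instead of the element node.
        xpath => [
            "/*[local-name()='msgType1']/*[local-name()='sequenceNumber']/text()", "sequenceNumber"
        ]
    }
}
```

If the query matches, the sequenceNumber field should be populated from the sample message. The target option is left out here because it should only matter when store_xml is enabled.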