How to read XML format within a .log file


(David Garces) #1

Hi all :slight_smile:

I have an app that writes an XML entry to a .log file for each transaction it receives.
So basically, this is the XML format:

<Bnx>
    <HEADER orig="9eb23c4d0a05e60a210200ce00000626" App="3" IdServ="test" IdTran="test" OpeDate="Thu Dec 28 14:56:09 COT 2017" Lang="EN" />
    <BnxChild>
        <BnxDescription>
            <medetail>test</medetail>
            <tedetail>example</tedetail>
            <rrndetail>123</rrndetail>|
            <lodetail>500</lodetail>
            <docdetail>6</docdetail>
            <date_detail>2017-09-27</date_detail>
            <hour_detail>14:20:00</hour_detail>
            <dec_detail>1.50</dec_detail>
        </BnxDescription>
    </BnxChild>
</Bnx>

And here is my Logstash config file:

input {
    file {
        path => "/path/to/file/example.log"
        start_position => "beginning"
        type => "bnxdata"
        codec => multiline {
            pattern => "</Bnx>"
            negate => "true"
            what => "previous"
            multiline_tag => "test_multiTag"
            max_lines => 1000
            auto_flush_interval => 1
        }
    }
}

filter {
    if [type] == "bnxdata" {

        xml {
            source => "message"
            target => "parsed"
            add_field => {
                Bnx => "%{[parsed][Bnx]}"
                BnxChild => "%{[parsed][BnxChild]}"
            }
            xpath => [
                "//Bnx/BnxChild/BnxDescription/@medetail/text()", "medetail",
                "//Bnx/BnxChild/BnxDescription/@tedetail/text()", "tedetail",
                "//Bnx/BnxChild/BnxDescription/@rrndetail/text()", "rrndetail",
                "//Bnx/BnxChild/BnxDescription/@lodetail/text()", "lodetail"
            ]
        }

        date {
            match => ["endTime", "yyyy-MM-dd HH:mm:ss", "ISO8601"]
        }
    }
}

output {
    if [type] == "bnxdata" {
        stdout {codec => rubydebug}
        elasticsearch {
            hosts => ["http://localhost:9200/"]
            index => "auth2-%{+YYYY.MM.dd}"
            document_type => "bnxdata"
        }
    }
}

When I launch Logstash, I get the following error:

{
"message" => "<Bnx>\n <HEADER orig="9eb23c4d0a05e60a210200ce00000626" App="3" IdServ="test" IdTran="test" OpeDate="Thu Dec 28 14:56:09 COT 2017" Lang="EN" />\n <BnxChild>\n <BnxDescription>\n <medetail>test</medetail>\n <tedetail>example</tedetail>\n <rrndetail>123</rrndetail>|\n <lodetail>500</lodetail>\n <docdetail>6</docdetail>\n <date_detail>2017-09-27</date_detail>\n <hour_detail>14:20:00</hour_detail>\n <dec_detail>1.50</dec_detail>\n </BnxDescription>\n </BnxChild>\n ",
"@version" => "1",
"@timestamp" => 2018-02-14T20:08:12.155Z,
"tags" => [
[0] "test_multiTag",
[1] "_xmlparsefailure"
],

Does anyone know what this means and how I can solve it?
Or if someone knows a more appropriate way to read the XML, I'd appreciate it :smiley:
-Regards


(Magnus Bäck) #2

Your multiline configuration includes everything up to but not including </Bnx>, so each event is missing its closing tag and the xml filter cannot parse it. Does example.log contain multiple log entries or can you just slurp the whole file into a single event?


(David Garces) #3

Yep, the log contains multiple XML entries.


(Magnus Bäck) #4

Then your multiline configuration should look like this:

pattern => "^<Bnx>"
negate => true
what => "previous"

That is, unless the current line is the first line of an XML document, merge this line with the previous line.
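
To make the suggestion concrete, here is a minimal sketch of how that snippet might slot into the input block from post #1. The path, type, and auto_flush_interval values are copied from the original config; multiline_tag and max_lines are left out only for brevity. The xpath line is an additional assumption that was not discussed in this thread: in the sample document, medetail and its siblings are child elements rather than attributes, so an expression without @ is probably what is wanted.

```
input {
    file {
        path => "/path/to/file/example.log"
        start_position => "beginning"
        type => "bnxdata"
        codec => multiline {
            # A line starting with <Bnx> begins a new event; every other
            # line is appended to the event before it.
            pattern => "^<Bnx>"
            negate => true
            what => "previous"
            # Flush the last entry after 1 second without new lines, since
            # no following <Bnx> line arrives to close it.
            auto_flush_interval => 1
        }
    }
}

filter {
    if [type] == "bnxdata" {
        xml {
            source => "message"
            target => "parsed"
            # Assumption (not from the thread): select the element's text,
            # because medetail is an element rather than an attribute.
            xpath => [
                "/Bnx/BnxChild/BnxDescription/medetail/text()", "medetail"
            ]
        }
    }
}
```

With this grouping, each event should hold one complete document from <Bnx> through </Bnx>, which is what the xml filter needs in order to parse it.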

