Hello everyone,
I have tried several approaches to parsing my logs, which are XML files, into JSON with Logstash.
A log file looks like this:
<log level="INFO" time="Wed Sep 09 09:18:48 EDT 2015" timel="1441804728245" id="123456789" cat="COMMUNICATION" comp="" host="127.0.0.0.1" req="" app="" usr="" thread="" origin="">
<msg>
<![CDATA[Method=GET URL=http://test.de/24dsdf3=0TReq(provider=Test, Decoding_Feat=[], Accept-Encoding=gzip, Accept=*/*) Result(Content-Encoding=[gzip], Content-Length=[3540], ntCoent-Length=[6660], Content-Type=[text/xml; charset=utf-8]) Status=200 Times=TISP:426/CSI:-/Me:0/Total:426]]>
</msg>
<info>
</info>
<excp>
</excp>
</log>
<log level="INFO" time="Wed Sep 09 09:18:48 EDT 2015" timel="1441804728245" id="123456789" cat="COMMUNICATION" comp="" host="127.0.0.0.1" req="" app="" usr="" thread="" origin="">
<msg>
<![CDATA[Method=GET URL=http://test.de/24dsdf3=0TReq(provider=Test, Decoding_Feat=[], Accept-Encoding=gzip, Accept=*/*) Result(Content-Encoding=[gzip], Content-Length=[3540], ntCoent-Length=[6660], Content-Type=[text/xml; charset=utf-8]) Status=200 Times=TISP:426/CSI:-/Me:0/Total:426]]>
</msg>
<info>
</info>
<excp>
</excp>
</log>
<log level="INFO" time="Wed Sep 09 09:18:48 EDT 2015" timel="1441804728245" id="123456789" cat="COMMUNICATION" comp="" host="127.0.0.0.1" req="" app="" usr="" thread="" origin="">
<msg>
<![CDATA[Method=GET URL=http://test.de/24dsdf3=0TReq(provider=Test, Decoding_Feat=[], Accept-Encoding=gzip, Accept=*/*) Result(Content-Encoding=[gzip], Content-Length=[3540], ntCoent-Length=[6660], Content-Type=[text/xml; charset=utf-8]) Status=200 Times=TISP:426/CSI:-/Me:0/Total:426]]>
</msg>
<info>
</info>
<excp>
</excp>
</log>
Each log file contains multiple log entries (three in this example).
I tried this configuration:
input {
  file {
    path => "/path/to/file.log.*"
    start_position => "beginning"
  }
}
filter {
  multiline {
    pattern => "<log"
    negate => true
    what => "previous"
  }
  xml {
    store_xml => false
    source => "message"
    xpath => [
      "/log/@level", "level",
      "/log/@time", "time",
      "/log/@timel", "timel",
      "/log/@id", "id",
      "/log/@cat", "cat",
      "/log/@comp", "comp",
      "/log/@host", "host",
      "/log/@req", "req",
      "/log/@app", "app",
      "/log/@usr", "usr",
      "/log/@thread", "thread",
      "/log/@origin", "origin",
      "/log/msg/text()", "msg_txt"
    ]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
  }
}
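To sanity-check what the xpath settings should extract, I also parsed one log entry outside Logstash with a minimal Python sketch (standard library only; the shortened CDATA payload below is just a placeholder, not my real message):

```python
# Minimal standalone check (Python stdlib only): parse one <log> element
# and extract the same fields the xpath settings above target.
# The shortened CDATA payload is a placeholder for the real message.
import xml.etree.ElementTree as ET

sample = """<log level="INFO" time="Wed Sep 09 09:18:48 EDT 2015" \
timel="1441804728245" id="123456789" cat="COMMUNICATION" comp="" \
host="127.0.0.0.1" req="" app="" usr="" thread="" origin="">
<msg><![CDATA[Method=GET URL=http://test.de/ Status=200]]></msg>
<info></info>
<excp></excp>
</log>"""

root = ET.fromstring(sample)
event = dict(root.attrib)                        # level, time, timel, id, cat, ...
event["msg_txt"] = root.findtext("msg").strip()  # CDATA content of <msg>

print(event["level"])    # -> INFO
print(event["msg_txt"])  # -> Method=GET URL=http://test.de/ Status=200
```

This extracts the fields cleanly, so I would expect the same from the xml filter.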
But when I run Logstash, it produces really strange output, and as a consequence Elasticsearch cannot index it.
Do you have any suggestions as to what I am missing?
Best regards,
Oemer