I just discovered Logstash and have managed to extract data from log files. This time I have to extract information from events that span multiple lines. Here is an example:
2016-03-07 14:09:11,613 INFO [][com.ole.ecom.jms.crm.JmsCrmSender] Envoi du message ...<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
I need to recover the information on the first line, and I must also extract the number between <code> and </code> (8800356499460) and the email address between <email> and </email> (outiztest+prolivraison@gmail.com).
I created a conf file but it doesn't work. How should I write a configuration for this type of file (.log), where the information is spread over several lines?
If you can see how to do this, I would be very grateful for your help.
Use a multiline codec to join the lines of a multiline event into a single Logstash event, then use the grok filter to extract the various pieces of the log into separate fields, then use the xml filter to parse the XML document.
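To make those three steps concrete, here is a minimal sketch of such a pipeline. It is only an illustration under some assumptions: the file path is a placeholder, the field names (timestamp, level, class, xml_payload, doc) are made up for this example, and it assumes that everything from the first "<" to the end of the event is a well-formed XML document.

```
input {
  file {
    # Placeholder path, replace with the real log file.
    path => "/path/to/application.log"
    # Assumption: read the file from the top on the first run
    # (by default the file input only picks up newly appended lines).
    start_position => "beginning"
    codec => multiline {
      # Lines that do NOT start with a timestamp are appended
      # to the previous event.
      pattern => "^%{TIMESTAMP_ISO8601}"
      negate => true
      what => "previous"
    }
  }
}

filter {
  grok {
    # (?m) lets GREEDYDATA match across the newlines of the joined event.
    # [^<]* skips the free text ("Envoi du message ...") before the XML.
    match => {
      "message" => "(?m)^%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level}\s*\[[^\]]*\]\[%{NOTSPACE:class}\]\s*[^<]*%{GREEDYDATA:xml_payload}"
    }
  }
  xml {
    # Parse the captured XML text and store the result under "doc".
    source => "xml_payload"
    target => "doc"
  }
}
```

With this layout grok only has to handle the first line and hand over the rest, and the individual XML elements end up as nested fields under doc instead of being picked out with a long regular expression.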
Here is an example event:
2016-03-07 14:09:31,607 INFO [][com.outiz.ecom.jms.crm.JmsCrmSender] Envoi du message ...<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root xmlns="http://wlsosb.pointp.saint-gobain.net/hybris_crm/">
  <order>
    <numCommande>00130175</numCommande>
    <idHybris>8804793647149</idHybris>
    <numCompteHybris>8803711877124</numCompteHybris>
    <commandePasseParIdUser></commandePasseParIdUser>
    <commandePasseParProfil>WEB_PRO</commandePasseParProfil>
    <produitsAsiles></produitsAsiles>
    <montantTTC>222.7</montantTTC>
    <montantHT>192.09</montantHT>
    <dateCommande>2016-03-07T11:26:16.000+01:00</dateCommande>
    <horodatageCreation>2016-03-07T11:26:16.000+01:00</horodatageCreation>
    <horodatageModification>2016-03-07T14:09:24.803+01:00</horodatageModification>
    <contact>outiztest+wuwu@gmail.com</contact>
    <coupon>true</coupon>
    <codeCoupon>COUPON MAI</codeCoupon>
    <codePromotion>199</codePromotion>
    <nomCoupon>1000 utilisation</nomCoupon>
    <reductionCouponTTC>0.0</reductionCouponTTC>
    <reductionCouponHT>0.0</reductionCouponHT>
    <reductionCouponHTPercent>0.00</reductionCouponHTPercent>
    <montantBundle>0</montantBundle>
    <montantReducBundle></montantReducBundle>
    <montantReducBundleTTC></montantReducBundleTTC>
    <lieuPayment>WEB</lieuPayment>
    <moyenPayment>VISA</moyenPayment>
    <canalVente>OUTIZ</canalVente>
I want to extract the first line (the date and so on) and the order number (numCommande) from this example.
I wrote my filter for this type of message as follows:
grok {
  match => { "message" => "%{YEAR:YR}-%{MONTHNUM:MNTNUM}-%{MONTHDAY:MNTDAY}[ ]%{HOUR:HR}:?%{MINUTE:MIN}:?%{SECOND:SEC} %{LOGLEVEL:LOG}(\s*)\[\]\[%{NOTSPACE:CLASS}\]((.*)(\n))*(\s*)<order>(\n)(\s*)(<numCommande>)%{GREEDYDATA:NUMCMD}(</numCommande>)" }
  add_field => {
    "haspath" => "yes"
    "endpoint" => "%{NUMCMD}"
  }
  break_on_match => true
  add_tag => "orderCRM"
  remove_tag => ["_grokparsefailure"]
}
and the input as follows:
input {
  file {
    type => "HYBRIS"
    path => "C:\Users\O6665391\ELK\logstash-2.2.2\multiline.log"
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601}"
      negate => true
      what => "previous"
    }
  }
}
The problem is that my filter works on grokdebug, but it does not work with the .conf file.
I get no error message, but it does not work.