Hi, I need to extract one value from parsed XML. My Logstash configuration is:
xml {
source => "contentRequest"
target => "contentRequest_field"
store_xml => false
xpath => [ "/datianagrafici/datipersonali/codicefiscale", "codicefiscale" ]
}
If I set store_xml => true, I can see the extracted fields.
I need to extract the value of codicefiscale. Based on that view, I configured xpath this way:
xpath => [ "/datianagrafici/datipersonali/codicefiscale", "codicefiscale" ]
I also tried to set
xpath => [ "//datianagrafici/datipersonali/codicefiscale", "codicefiscale" ]
but I get the same result. I expect a new field called codicefiscale to appear in Elasticsearch, but I don't see it. Is my configuration correct?
Thanks
Badger
December 5, 2022, 5:37pm
What does the source XML look like?
Unfortunately the XML is incorrectly formatted, but the Logstash xml plugin still extracts all the fields correctly. I don't know whether this is a problem for xpath. The contentRequest field contains exactly:
{ "xml": "<lavoratore><datiinvio><dataultimoagg>2019-05-22</dataultimoagg><codiceentetit>xxxxxxx</codiceentetit><tipovariazione>01</tipovariazione><datadinascita>1985-08-23</datadinascita></datiinvio><datianagrafici><datipersonali><codicefiscale>xxxxxxxxxx</codicefiscale><cognome>xxxxxx</cognome><nome>xxxxxxxxx</nome><sesso>x</sesso><datanascita>1985-08-23</datanascita><codcomune>xxxx</codcomune><codcittadinanza>xxx</codcittadinanza></datipersonali><residenza><codcomune>xxxx</codcomune><cap>xxxx</cap><indirizzo>xxxxxxx</indirizzo><localita /></residenza><domicilio><codcomune>xxx</codcomune><cap>xxxxx</cap><indirizzo>xxxxxxxx</indirizzo></domicilio><recapiti><telefono>0000000000</telefono><cellulare>3xxxxxxx</cellulare></recapiti></datianagrafici></lavoratore>"}
Indented version:
<lavoratore>
  <datiinvio>
    <dataultimoagg>xxxx-xx-xx</dataultimoagg>
    <codiceentetit>xxxxxxx</codiceentetit>
    <tipovariazione>xx</tipovariazione>
    <datadinascita>xxxx-xx-xx</datadinascita>
  </datiinvio>
  <datianagrafici>
    <datipersonali>
      <codicefiscale>xxxxxxxxxx</codicefiscale>
      <cognome>xxxxxx</cognome>
      <nome>xxxxxxxxx</nome>
      <sesso>x</sesso>
      <datanascita>xxxx-xx-xx</datanascita>
      <codcomune>xxxx</codcomune>
      <codcittadinanza>xxx</codcittadinanza>
    </datipersonali>
    <residenza>
      <codcomune>xxxx</codcomune>
      <cap>xxxx</cap>
      <indirizzo>xxxxxxx</indirizzo>
      <localita />
    </residenza>
    <domicilio>
      <codcomune>xxx</codcomune>
      <cap>xxxxx</cap>
      <indirizzo>xxxxxxxx</indirizzo>
    </domicilio>
    <recapiti>
      <telefono>0000000000</telefono>
      <cellulare>xxxxxxx</cellulare>
    </recapiti>
  </datianagrafici>
</lavoratore>
Thanks
Badger
December 5, 2022, 6:55pm
The parsing used for store_xml is fundamentally different from the parsing used for xpath. The former uses the xml-simple library, the latter uses Nokogiri.
Nokogiri only works on correctly formatted XML. XmlSimple tolerates all kinds of leading or trailing junk around the XML, such as the JSON wrapper around your document.
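To illustrate the point, here is a rough sketch in Python rather than Logstash. It uses Python's standard-library ElementTree as a stand-in for a strict parser (it is not Nokogiri, and the field value is a shortened, made-up sample), and shows that the raw JSON-wrapped string is rejected while the unwrapped XML parses fine:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the contentRequest field:
# a JSON object whose "xml" key holds the actual XML document.
content_request = json.dumps({
    "xml": "<lavoratore><datianagrafici><datipersonali>"
           "<codicefiscale>ABC123</codicefiscale>"
           "</datipersonali></datianagrafici></lavoratore>"
})


def strict_parse_ok(text):
    """Return True if a strict XML parser accepts the text as-is."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The raw field starts with '{', so a strict parser rejects it.
raw_ok = strict_parse_ok(content_request)

# After removing the JSON wrapper, the same content is well-formed XML.
unwrapped_ok = strict_parse_ok(json.loads(content_request)["xml"])

print(raw_ok, unwrapped_ok)  # False True
```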
Try
json { source => "contentRequest" target => "[@metadata][contentRequest]" }
xml {
source => "[@metadata][contentRequest][xml]"
store_xml => false
xpath => { "//datianagrafici/datipersonali/codicefiscale/text()" => "codicefiscale" }
}
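The same two steps can be sketched outside Logstash to check the path against a sample. This is only an illustration in Python with the standard library, not what the xml filter runs internally; the function name extract_codicefiscale and the shortened sample are assumptions made for the example:

```python
import json
import xml.etree.ElementTree as ET


def extract_codicefiscale(content_request):
    """Hypothetical helper mirroring the json filter followed by the xml filter."""
    # Step 1 (json filter): decode the wrapper and take the "xml" key.
    xml_text = json.loads(content_request)["xml"]
    # Step 2 (xml filter): parse the XML and follow the path.
    root = ET.fromstring(xml_text)
    node = root.find(".//datianagrafici/datipersonali/codicefiscale")
    # ElementTree has no text() step, so read the element's text directly.
    return node.text if node is not None else None


# Shortened version of the contentRequest value posted above.
sample = (
    '{"xml": "<lavoratore><datiinvio><tipovariazione>01</tipovariazione>'
    '</datiinvio><datianagrafici><datipersonali>'
    '<codicefiscale>xxxxxxxxxx</codicefiscale><cognome>xxxxxx</cognome>'
    '</datipersonali></datianagrafici></lavoratore>"}'
)

print(extract_codicefiscale(sample))  # xxxxxxxxxx
```

In the Logstash configuration, the trailing text() in the xpath expression plays the role that .text plays here: it selects the text inside the element rather than the element itself.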
You saved me! So it's as I thought: xpath handles the parsing differently.
Thanks!