XPath does not work in the Logstash xml filter, but it does in XMLSpy

I have well-formed XML in a DB column. I read it from the DB with the Logstash JDBC input plugin.
The XML I get back contains escape characters.
In the xml filter I try to extract a value from it with an XPath expression. The expression works in XMLSpy, but not in the Logstash xml filter.

xml {
	source => "coxml"
	xpath => [ "/CORequest/COForm/COFormContent/Signatory/SignatoryLocation/text()", "plaats" ]
	target => "certificate"
}

Without seeing the XML in question I don't think there's any way to help out.

This is it:

<CORequest>
  <COForm>
    <COFormContent>
      <Consignor>
        <AddressName>Jean-Louis Michel</AddressName>
        <AddressStreetAndNumber>Soesterdijkstraatwegdreef</AddressStreetAndNumber>
        <AddressCity>Kessel-Lo</AddressCity>
        <AddressPostalCode>3010</AddressPostalCode>
        <AddressCountry>België</AddressCountry>
        <OnBehalfOf>
          <AddressName>Michel Jean-Louis</AddressName>
          <AddressStreetAndNumber>Doelweg 69</AddressStreetAndNumber>
          <AddressCity>Antwerpen</AddressCity>
          <AddressPostalCode>2030</AddressPostalCode>
          <AddressCountry>België</AddressCountry>
        </OnBehalfOf>
      </Consignor>
      <Consignee>
        <AddressName>Obarak</AddressName>
        <AddressStreetAndNumber>The way to president?</AddressStreetAndNumber>
        <AddressCity>Brussel</AddressCity>
        <AddressPostalCode>1000</AddressPostalCode>
        <AddressCountry>België</AddressCountry>
      </Consignee>
      <Origins>
        <CountryOfOriginCollection>
          <Country>BRITISH INDIAN OCEAN TERRITORY(IO)</Country>
          <Country>MADAGASCAR(MG)</Country>
          <Country>MALI(ML)</Country>
        </CountryOfOriginCollection>
      </Origins>
      <Transport>De Belg heeft nu al massaal veel reizen geboekt voor tijdens de vakantie van Allerheiligen. De touroperators moeten al extra vliegtuigen inlassen om aan de grote vraag te kunnen voldoen. Dat blijkt uit een enquête bij de touroperators.</Transport>
      <Remarks>"Voor wat betreft de ideale vertrek- en terugkeerdata tijdens Allerheiligen is alles volzet", zegt een woordvoerder van Best Tours. Bij Thomas Cook is de situatie gelijkaardig. "We zijn verplicht om extra capaciteit te voorzien", aldus woordvoerder Claude Pérignon.</Remarks>
      <Shipment>
        <ShipmentDescription>Deze tendens lijkt verrassend in een periode dat de koopkracht daalt en de olieprijs almaar stijgt. Tijdens Allerheiligen vertrekken vooral Belgen op vakantie die tijdens de zomer gewerkt hebben, zegt Pérignon.</ShipmentDescription>
        <ShippedGoods>1. radio's
2. telefoons
3. GSM's
4. netbooks
5. leesboeken
6. tijdschriften
7. aardappelnootjes
8. radio's
9. telefoons
10. GSM's
11. netbooks
12. leesboeken
13. tijdschriften
14. aardappelnootjes
15. bijlage toegevoegd?
17. nummer zeventien
18. een achttiende artikel
19. en nog eentje erbij
20. die lijst wordt lang
21. wat is nu die bijlage?</ShippedGoods>
        <ShippingMarks />
      </Shipment>
      <Quantity>12 stuks
2 dozen</Quantity>
      <Signatory>
        <SignatoryName>PWOndernemer</SignatoryName>
        <SignatoryLocation>Drongen</SignatoryLocation>
        <SignatureDate>2008-08-19</SignatureDate>
      </Signatory>
      <Customs>
        <DestinationCountry>EUROPEAN COMMUNITY</DestinationCountry>
        <ValueInEuro>12345.66</ValueInEuro>
        <CustomCodeCollection>
          <CustomCode>6204000000</CustomCode>
          <CustomCode>6204000000</CustomCode>
          <CustomCode>6204000000</CustomCode>
          <CustomCode>6204000000</CustomCode>
        </CustomCodeCollection>
      </Customs>
      <AdditionalInformation>Als Egypte en Tunesie goekkope reizen zijn en voor de lagere middenklasse,wel ik ben ferm content dat ik daar nog naartoe kan. En wat betreft de hotels ,er zijn er daar 4 en 5* hotels die heel mooi zijn en zeker voldoen aan een zon en zee vakantie.En de zon die schijnt voor iedereen hetzelfde ,of je nu een hotel hebt van 1200 euro per pp of eentje van 600 euro per persoon.</AdditionalInformation>
    </COFormContent>
    <COFormAttachmentCollection>
      <COAttachment forDeclarationOfOwnManufacturing="true">
        <Reference>204</Reference>
        <Base64Hash>R/h0KXjgsYs2ztc4P+MFztLyQzc=</Base64Hash>
      </COAttachment>
      <COAttachment forDeclarationOfOwnManufacturing="true">
        <Reference>203</Reference>
        <Base64Hash>S4Iq3qWE/Q9+DzXxS3qvOqoBmzA=</Base64Hash>
      </COAttachment>
    </COFormAttachmentCollection>
  </COForm>
</CORequest>

Okay, looks good. Please show your configuration and an example event produced by Logstash (use a stdout { codec => rubydebug } output). Maybe you're not feeding Logstash the right data.
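
A minimal sketch of such an output section, assuming it is added temporarily alongside whatever outputs you already have:

```
output {
	# Print every event to the console in a readable, Ruby-style format.
	stdout { codec => rubydebug }
}
```

With this in place each event is printed with all of its fields, which makes it easier to see what the xml filter actually received and produced.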

config

input {
	jdbc {
		jdbc_driver_library => "C:\path\to\enu\mssql-jdbc-6.4.0.jre8.jar"
		jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
		jdbc_connection_string => "jdbc:sqlserver://XXX;databasename=XXX;"
		jdbc_user => "XXX"
		jdbc_password => "XXX"
		schedule => "27 17 * * *"
		statement => "SELECT COID, COXML, AcceptDenyDateTime AS 'Aanvaard of a-posteriori op', Chamber.ChamberName AS 'Kamer', [User].UserFirstname AS 'Voornaam', [User].UserLastname AS 'Naam', [Status].StatusShortDescription AS 'Status', Company.CompanyName AS 'Bedrijf' FROM [XXX].dbo.CO LEFT JOIN [XXX].dbo.Chamber ON CO.ChamberID = Chamber.ChamberID LEFT Join [XXX].[dbo].[User] ON CO.UserID = [User].UserID LEFT Join [XXX].[dbo].[Status] ON CO.StatusID = [Status].StatusID LEFT JOIN [XXX].[dbo].[Company] ON CO.CompanyID = Company.CompanyID"
	 }
}
filter {
	xml {
		source => "coxml"
		target => "certificate"
	}
}
output {
	elasticsearch {
		hosts => ["localhost:9200"]
		index => "certificates"
		manage_template => false
	}
}

with this output

{
                        "bedrijf" => "OTN SYSTEMS NV",
                         "status" => "Accepted",
                          "kamer" => "Kempen",
                           "naam" => "Derboven",
                       "@version" => "1",
                     "@timestamp" => 2018-08-06T08:09:07.972Z,
                           "coid" => 2303,
                    "certificate" => {json here - too long to post},
                          "coxml" => "xml here - too long to post",
    "aanvaard of a-posteriori op" => 2009-04-09T10:03:07.843Z,
                       "voornaam" => "Liliane"
}

Looks good. No clues in the Logstash log?

Nothing.

But I do wonder, logstash added escaping backslashes before the double quotes in the XML. I feel that's why xpath does not work

"<CORequest xmlns=\"some namespace here\"><COForm>...

the same in the DB field:

<CORequest xmlns="some namespace here">
  <COForm>
    <COFormCo ...

In ES (viewed through Kibana) the backslashes are not there.

But I do wonder, logstash added escaping backslashes before the double quotes in the XML.

Where are those visible?

They are visible in the Logstash stdout output.

Logstash prints double-quoted strings so any double quotes in those strings will of course be escaped.

The XPath still does not work.
Is there anything else I should check?

Sorry, I don't have any suggestions left.

I got these lines out of debug mode

[2018-08-06T12:47:02,160][DEBUG][logstash.filters.xml ] Running xml filter {:event=>#<LogStash::Event:0x35650d8>}
[2018-08-06T12:47:02,924][DEBUG][logstash.filters.xml ] Event after xml filter {:event=>#<LogStash::Event:0x35650d8>}

Any idea where I could find the data for this event?

Or what should I do now? Can I talk to a dev?
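
One thing that may be worth checking: the stdout sample above shows a default namespace on the root element (xmlns="some namespace here"), which the XML posted earlier in the thread does not have. In XPath 1.0 an unprefixed step such as /CORequest only matches elements that are in no namespace, so the expression can come back empty even though the document itself parses fine. A minimal sketch of one possible workaround, assuming your version of the xml filter supports the remove_namespaces option (source, target, and the plaats field are taken from the configuration posted above):

```
filter {
	xml {
		source => "coxml"
		target => "certificate"
		# Assumption: stripping the namespaces lets the unprefixed XPath steps match.
		remove_namespaces => true
		xpath => [ "/CORequest/COForm/COFormContent/Signatory/SignatoryLocation/text()", "plaats" ]
	}
}
```

If stripping namespaces is not acceptable, the filter's namespaces option can instead map a prefix of your choosing to the namespace URI, with that prefix then used on every step of the expression. The URI is elided in this thread, so it is not filled in here.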
