Uploading an XML file into Logstash

Hi,
I have already loaded my XML file into Logstash. Everything seems okay, but the result is not what I want: I want each tag mapped to its own column (field), so that I can search in Kibana by column name instead of searching the whole document.
This is the result in Kibana:

      {
        "_index": "tizer",
        "_type": "tizerlfiles",
        "_id": "AWhvriZgrW4BZjJkO214",
        "_score": 1,
        "_source": {
          "@version": "1",
          "host": "DESKTOP-2LL9494",
          "path": "D:/test.xml",
          "@timestamp": "2019-01-21T09:11:50.242Z",
          "message": """
<?xml version="1.0" encoding="UTF-8"?>
<tns:Invoicing xmlns:tns="xxx">
	<tns:FileNumber>20180919093512</tns:FileNumber>
	<tns:FileDate>2018-09-19</tns:FileDate>
	<tns:Forwarder>DBS</tns:Forwarder>
	<tns:Invoice>
		<tns:InvoicingDate>"2018-09-17"</tns:InvoicingDate>
		<tns:InvoicingBranch>SJCPTY</tns:InvoicingBranch>
		<tns:InitialInvoiceNumber>0</tns:InitialInvoiceNumber>
		<tns:ChanelInvoicingRefrence>257805</tns:ChanelInvoicingRefrence>
		<tns:FFInvoiceLineRefrence>01041000</tns:FFInvoiceLineRefrence>
		<tns:PayerAccountNameCode>200111</tns:PayerAccountNameCode>
		<tns:BusinessDivision>FA</tns:BusinessDivision>
		<tns:ETD>2018-09-16</tns:ETD>
		<tns:ETA>2018-09-16</tns:ETA>
		<tns:TypeConsignee>Other</tns:TypeConsignee>
		<tns:Consignee>EUROPERFUMERIA</tns:Consignee>
		<tns:Customer>OTHER</tns:Customer>
		<tns:Departure>
			<tns:CountryCode>FR</tns:CountryCode>
			<tns:CodePostal>95470</tns:CodePostal>
			<tns:City>VEMARS</tns:City>
			<tns:Address>"P.A. des Portes de Vemars CR9. Rue de la Haie Marteau "</tns:Address>
			<tns:Airport>
				<tns:Code>CDG</tns:Code>
				<tns:Denomination>FRANCE</tns:Denomination>
			</tns:Airport>
			<tns:Region>EMEA</tns:Region>
		</tns:Departure>
	</tns:Invoice>
""",
          "tags": [
            "multiline"
          ]
        }   

This is my config file:

input {
 file {
  path => "D:/test.xml"
  start_position => beginning
  sincedb_path => "NUL"
  codec => multiline {
  pattern => "<invoicing>|</invoicing>"
  negate => "true"
  what => "previous"
  auto_flush_interval => 1
  max_lines => 3000
  }
  
 }
}


filter {
  xml {
   source => "message"
   target => "message.parsed"
   store_xml => false
   force_array => false
 
  }
 
}

output {
  stdout { codec => rubydebug }
elasticsearch {
  index => "tizer"
  hosts => ["localhost:9200"]
  document_type => "tizerlfiles"
 }
}
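
A note on the multiline settings above: the pattern `<invoicing>|</invoicing>` does not match any line of the file, because the tags are written `<tns:Invoicing ...>` and `</tns:Invoicing>` (with a prefix and a capital I). With `negate => "true"` and `what => "previous"`, every non-matching line is appended to the previous one, so the whole file ends up in a single `message`, as in the Kibana output. Separately, with `store_xml => false` and no `xpath` entries, the xml filter has nothing to store, which would explain why no per-tag fields appear. The Python sketch below only imitates the grouping rule to make it visible; `group_lines` is a hypothetical helper, not part of Logstash, and timed flushing (`auto_flush_interval`) is not modelled.

```python
import re

def group_lines(lines, pattern):
    """Imitate multiline with negate=true, what=previous:
    a line that does NOT match the pattern is appended to the previous event."""
    regex = re.compile(pattern)
    events = []
    for line in lines:
        if regex.search(line) or not events:
            events.append([line])      # matching line (or the very first line) starts a new event
        else:
            events[-1].append(line)    # non-matching line joins the previous event
    return ["\n".join(event) for event in events]

lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<tns:Invoicing xmlns:tns="xxx">',
    '\t<tns:FileNumber>20180919093512</tns:FileNumber>',
    '</tns:Invoicing>',
]

events = group_lines(lines, "<invoicing>|</invoicing>")
print(len(events))  # the pattern never matches, so all lines form one event
```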

Hello,

Please find a sample example for loading an XML file here.

Hope this helps you.

I've tried all the examples but it doesn't work. Can you please look at my problem? @balumurari1

You need to use xpath to split the XML to get proper output, as shown in the example linked previously (Unable to load xml file in logstash).
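
To illustrate the idea of mapping XPath expressions to field names, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree` rather than Logstash. The sample document is a shortened copy of the one in this thread; the `FIELDS` mapping and the `extract_fields` helper are hypothetical names chosen for the illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the document from this thread (XML declaration omitted).
XML = """<tns:Invoicing xmlns:tns="xxx">
  <tns:FileNumber>20180919093512</tns:FileNumber>
  <tns:Forwarder>DBS</tns:Forwarder>
  <tns:Invoice>
    <tns:InvoicingBranch>SJCPTY</tns:InvoicingBranch>
  </tns:Invoice>
</tns:Invoicing>"""

# Prefix-to-URI mapping so that "tns:" can be used in the paths below.
NS = {"tns": "xxx"}

# Field name -> path relative to the root element <tns:Invoicing>.
FIELDS = {
    "FileNumber": "tns:FileNumber",
    "Forwarder": "tns:Forwarder",
    "InvoicingBranch": "tns:Invoice/tns:InvoicingBranch",
}

def extract_fields(xml_text):
    """Return one entry per configured field, holding the text of the matched element."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(path, namespaces=NS) for name, path in FIELDS.items()}

fields = extract_fields(XML)
print(fields)
```

Each path yields one named field, which is the "one column per tag" layout asked for in the question.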

The xpath works very well, but there is a small problem when the tags are nested in a tree.
For example, it works when I write
xpath => [ "/A/FileNumber/text()", "FileNumber"]
but it doesn't work for other tags like this one:

xpath => [ " /A/B/number/text()", "number", 
           "/A/B/words/text()", "words"  ] 

@balumurari1

Hello,

Make sure your XML is standard XML (with the same tags repeated).

Also, you can have any number of tags; that will not be a problem. Check your code carefully.

Regards

<?xml version="1.0" encoding="UTF-8"?>
<tns:Invoicing xmlns:tns="xxxxx">
	<tns:FileNumber>20180919093512</tns:FileNumber>
	<tns:FileDate>2018-09-19</tns:FileDate>
	<tns:Forwarder>DBS</tns:Forwarder>
	<tns:xoxoxo>yyyyyyyyy</tns:xoxoxo>
	<tns:Invoice>
		<tns:InvoicingDate>"2018-09-17"</tns:InvoicingDate>
		<tns:InvoicingBranch>SJCPTY</tns:InvoicingBranch>
		<tns:InitialInvoiceNumber>0</tns:InitialInvoiceNumber>
		<tns:ChanelInvoicingRefrence>257805</tns:ChanelInvoicingRefrence>
		<tns:FFInvoiceLineRefrence>01041000</tns:FFInvoiceLineRefrence>
		<tns:PayerAccountNameCode>200111</tns:PayerAccountNameCode>
		<tns:BusinessDivision>FA</tns:BusinessDivision>
		<tns:ETD>2018-09-16</tns:ETD>
		<tns:ETA>2018-09-16</tns:ETA>
		<tns:TypeConsignee>Other</tns:TypeConsignee>
		<tns:Consignee>EUROPERFUMERIA</tns:Consignee>
		<tns:Customer>OTHER</tns:Customer>
		<tns:Departure>
			<tns:CountryCode>FR</tns:CountryCode>
			<tns:CodePostal>95470</tns:CodePostal>
			<tns:City>VEMARS</tns:City>
			<tns:Address>"P.A. des Portes de Vemars CR9. Rue de la Haie Marteau "</tns:Address>
			<tns:Airport>
				<tns:Code>CDG</tns:Code>
				<tns:Denomination>FRANCE</tns:Denomination>
			</tns:Airport>
			<tns:Region>EMEA</tns:Region>
		</tns:Departure>
	</tns:Invoice>
</tns:Invoicing>

How does this XML format look to you? @balumurari1

Yup, this XML is good.
Show me your input config so that I can help you.

@balumurari1
This is my config file:

input
{
file
{
path => "D:\test.xml"
start_position => "beginning"
sincedb_path => "NUL"
codec => multiline {
pattern => "<tns:invoicing>|</tns:invoicing>"
negate => "true"
what => "previous"
auto_flush_interval => 1
max_lines => 10000
}
}
}
filter
{
xml
{
source => "message"
target => "parsed"
store_xml => "false"
xpath => [
"/tns:Invoicing/tns:FileNumber/text()", "FileNumber",
"/tns:Invoicing/tns:FileDate/text()", "FileDate",
"/tns:Invoicing/tns:Forwarder/text()", "Forwarder",
"/tns:Invoicing/tns:invoice/tns:InvoicingDate/text()", "InvoicingDate",
"/tns:Invoicing/tns:invoice/tns:InvoicingBranch/text()", "InvoicingBranch",
"/tns:Invoicing/tns:xoxoxo/text()", "xoxoxo"
]
}
mutate {	
remove_field => [ "message"]	
}
}

output
{
stdout
{
codec => rubydebug
}
elasticsearch {
  index => "chelou002"
  hosts => ["localhost:9200"]
  document_type => "chelouFiles002"
 }
}

All tags are loaded fine except these two, and I don't know why:

    "/tns:Invoicing/tns:invoice/tns:InvoicingDate/text()", "InvoicingDate",
    "/tns:Invoicing/tns:invoice/tns:InvoicingBranch/text()", "InvoicingBranch",

Change the path to

D:/test.xml

Also, why is the date field in the XML file wrapped in double quotes? Remove the double quotes in the XML file, which will help to resolve the issue.

I receive the XML files in this format; I can't modify 100 XML files each time :wink:

OK, but a date field with double quotes is invalid by XML file standards.

So the problem is with the XML now. :slight_smile:

Try it once with the quotes removed.
Hope it works.
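
For reference, double quotes inside element text do not make a document ill-formed; a parser simply keeps them as part of the value. A quick check in Python with the standard library (not Logstash) shows this, and also one way to strip the quotes after parsing instead of editing every file:

```python
import xml.etree.ElementTree as ET

element = ET.fromstring('<InvoicingDate>"2018-09-17"</InvoicingDate>')

value = element.text          # the quotes are ordinary characters in the text
cleaned = value.strip('"')    # strip them afterwards if they are unwanted

print(value)
print(cleaned)
```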

It's not working, unfortunately :frowning: @balumurari1

I guess the problem is with the columns that have at least two tags above them, right? @balumurari1

Dude, change this to

"/tns:Invoicing/tns:Invoice/tns:InvoicingDate/text()", "InvoicingDate",
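
The reason this fixes it is that XML element names, and therefore XPath steps, are case-sensitive: the document contains `<tns:Invoice>`, so a step written `tns:invoice` selects nothing. The same behaviour can be reproduced outside Logstash with Python's standard `xml.etree.ElementTree`; this is only a sketch on a shortened copy of the document.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the document from this thread (XML declaration omitted).
XML = """<tns:Invoicing xmlns:tns="xxxxx">
  <tns:Invoice>
    <tns:InvoicingDate>"2018-09-17"</tns:InvoicingDate>
  </tns:Invoice>
</tns:Invoicing>"""

NS = {"tns": "xxxxx"}
root = ET.fromstring(XML)

wrong = root.find("tns:invoice/tns:InvoicingDate", NS)   # lowercase "invoice"
right = root.find("tns:Invoice/tns:InvoicingDate", NS)   # same spelling as the document

print(wrong)        # nothing is selected for the lowercase step
print(right.text)   # the element is found once the case matches
```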

It was a dumb error, hahaha. Thanks bro, it works now :slight_smile: @balumurari1

Hi again @balumurari1,
is it normal to have an XML file all on one line, like this:

<?xml version="1.0" encoding="UTF-8" ?>
<tns:Invoicing xmlns:tns="xxxxr"><tns:FileNumber>20180919093512</tns:FileNumber><tns:FileDate>2018-09-19</tns:FileDate></tns:Invoicing>

because I always get an error when I load the file like that, whereas when I format it like this:

<?xml version="1.0" encoding="UTF-8" ?>
<tns:Invoicing xmlns:tns="xxxxr">
       <tns:FileNumber>20180919093512</tns:FileNumber>
       <tns:FileDate>2018-09-19</tns:FileDate>
</tns:Invoicing>

it works. I don't know where the issue is. Do I really need to format each XML file before loading it?

Best regards

Make sure the XML is standard XML and everything will work fine.

Also, mark your answer as the solution, which will help others.

It's standard XML, but I don't know why it gives this error; when I format it, it works ^^ @balumurari1
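
One possible explanation, which is an assumption and not confirmed anywhere in this thread: the file input is line-oriented and only passes on a line once it is terminated by a newline. If the single-line file has no newline after `</tns:Invoicing>`, the line holding all the data would never be read, whereas in the formatted file only the final closing tag would be missing (the Kibana output at the top of the thread also ends without `</tns:Invoicing>`). The Python sketch below imitates such a reader to show which lines would be seen; `complete_lines` is a hypothetical helper, not Logstash code.

```python
def complete_lines(content):
    """Return only the lines that are terminated by a newline,
    imitating a reader that waits for the delimiter before emitting a line."""
    return [line[:-1] for line in content.splitlines(keepends=True) if line.endswith("\n")]

# Single-line variant of the file, WITHOUT a newline at the very end.
single_line = (
    '<?xml version="1.0" encoding="UTF-8" ?>\n'
    '<tns:Invoicing xmlns:tns="xxxxr">'
    '<tns:FileNumber>20180919093512</tns:FileNumber>'
    '</tns:Invoicing>'
)

seen = complete_lines(single_line)
seen_with_newline = complete_lines(single_line + "\n")

print(seen)                    # only the XML declaration line is complete
print(len(seen_with_newline))  # with a trailing newline, both lines are complete
```

If this turns out to be the cause, appending a newline to the end of each file would be a lighter fix than reformatting it.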