Logstash XML parsing

Hi guys,

I have an XML parsing problem.
This is a very small part of the XML (the files are about 500MB each and contain a lot of different nested fields):

<?xml version="1.0" encoding="utf-8"?>
<project id="0001" name="TEST_CASE">
  <Extractions>
    <extractionInfo id="9" name="PHONE"/>
    <extractionInfo id="10" name="SIM"/>
  </Extractions>
  <metadata section="Extraction Data">
    <item name="DeviceInfo" sourceExtraction="10"><![CDATA[THIS_I_NEED_1]]></item>
  </metadata>
  <metadata section="Device Info">
    <item id="683c9ea2a37f" name="IMEI" sourceExtraction="9"><![CDATA[THIS_I_NEED_2]]></item>
  </metadata>
</project>

I am trying to retrieve the "THIS_I_NEED" strings, but I can't seem to get it to work.
This is my config, including the xml filter:

input {
	file {
		type => XML_Report
		start_position => "beginning"
		sincedb_path => "/dev/null"
		path => "/DATA/*/report.xml"
	}
}
filter {
	xml {
		store_xml => "false"
		source => "message"
		xpath =>[
			"/project/metadata/[@Device Info=@section]/item[@name=@IMEI]/text()","IMEI"
		]
	}
}
output {
	elasticsearch {
		host => ["localhost:9200"]
		index => "logstash-%{[type]}"
	}
}

Any ideas? Because I've tried many combinations, but nothing seems to work...
thanks in advance!

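You probably need to use multiline before the xml filter. The file input emits one event per line, so without it the xml filter never sees the whole document. Something like: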

multiline {
	pattern => ""
	negate => "true"
	what => "next"
}

https://www.elastic.co/guide/en/logstash/current/plugins-codecs-multiline.html
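
As a rough sketch of how the codec from that link could be attached directly to the file input from the question: the pattern `</project>` is my assumption (it is the closing tag of the root element in the posted sample), and the `max_lines` / `max_bytes` values are only illustrative. The codec's defaults (500 lines, 10 MiB) are far below a 500MB report, so they would have to be raised, and keeping a document that large in a single event will need a lot of heap.

```
input {
	file {
		type => "XML_Report"
		start_position => "beginning"
		sincedb_path => "/dev/null"
		path => "/DATA/*/report.xml"
		codec => multiline {
			# Assumption: the closing root tag marks the end of one document.
			pattern => "</project>"
			negate => true
			what => "next"
			# Illustrative limits only; tune them to the real file sizes.
			max_lines => 20000000
			max_bytes => "600 MiB"
		}
	}
}
```

With `negate => true` and `what => "next"`, every line that does not match the pattern is joined to the line that follows it, so an event is emitted only once the closing tag has been read.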

Anyway, if you post the errors you are getting, it would be easier to help.
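
Separately, the XPath in your filter does not look valid to me. A predicate goes directly after the element name (no slash before the bracket), the attribute name goes on the left, and the value is a quoted literal. A minimal sketch, assuming the complete document is in `message` as a single event; the second expression and the `DeviceInfo` field name are my additions for the other value in your sample:

```
filter {
	xml {
		source => "message"
		store_xml => false
		# Attribute test: element[@attr='value'], attached directly to the element.
		xpath => {
			"/project/metadata[@section='Device Info']/item[@name='IMEI']/text()" => "IMEI"
			"/project/metadata[@section='Extraction Data']/item[@name='DeviceInfo']/text()" => "DeviceInfo"
		}
	}
}
```

Each target field is stored as an array of matches. I have not checked how the CDATA content is serialized here, so if the value comes back with the CDATA wrapper still around it, that part would need adjusting.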

Hope it helps.

Regards
