Parsing nested XML into Logstash

<eg>
	<suite>
		<some_data_1 attr1>test 1
		<some_data_2 attr2>test 2
	</suite>
	<suite>
		<some_data_3 attr3>test 3
		<some_data_4 attr4>test 4
	</suite>
</eg>
<appendix>
	<more info>
		<some_data_1 attr1>Info1
		<some_data_2 attr2>Info2
	</more info>
	<more info>
		<some_data_1 attr3>info3
		<some_data_2 attr4>info4
	</more info>
</appendix>

How can I parse this XML file in Logstash so that each "more info" entry from the appendix is placed under the matching "some data" element? An example of the result I want is shown below:

<eg>
	<suite>
		<some_data_1 attr1>test 1
                        <some_data_1 attr1>Info1

With this I can map the fields into Elasticsearch under the same index, but the problem is how to do it in Logstash first. I explored the xml filter's xpath option and the split filter, but neither worked. Any help would be much appreciated.

The xml filter would be the right avenue here. What problems did you have?

A multiline codec on your input, combined with the xml filter's xpath option, will get you what you want. The Elastic Stack treats each unit of related data as an event, and by default every line of input becomes its own event. In your example, Logstash thinks you just fed it 20 events. To fix this, use the multiline codec to merge all the lines of the document into a single event. Then, in the xml filter, use xpath to say which data to extract and which field to store it in. For example:

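filter {
  xml {
    # Parse the XML held in the "message" field (the merged multiline event).
    source => "message"
    # Only extract the xpath matches; do not store the whole parsed document.
    store_xml => false
    # XPath expression => destination field name.
    xpath => {
      "/eg/suite/some_data_1/text()" => "Attribute 1"
    }
  }
}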

This will give you a field named Attribute 1 holding test 1. The xml filter stores xpath matches as an array, so the value may show up as a one-element list.
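The multiline step described above could look roughly like the sketch below. It assumes the XML is read from a file, that every document starts with a line beginning with <eg>, and that the real file is well-formed XML. The path is a hypothetical placeholder, and the sincedb_path and auto_flush_interval settings are only there to make repeated testing easier.

```
input {
  file {
    # Hypothetical path; point this at your own XML file.
    path => "/path/to/report.xml"
    start_position => "beginning"
    # Testing convenience: forget the read position between runs.
    sincedb_path => "/dev/null"
    codec => multiline {
      # Any line that does NOT start with <eg> is appended to the
      # previous line, so the whole document ends up in one event.
      pattern => "^<eg>"
      negate => true
      what => "previous"
      # Flush the pending event after 2 seconds without new lines;
      # otherwise the last document waits for the next <eg> line.
      auto_flush_interval => 2
    }
  }
}
```

With this in place, the message field of the resulting event holds the full document, which is what the xml filter then reads.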
