Create an array from a nested XML field

Shayleen · December 16, 2015, 4:18pm

Hi,

I'm quite new to ELK and I've run into a problem that I find almost impossible to solve.

I have an XML like this:
<root>
  <field_A>
    <subfield_A>8100291240</subfield_A>
    <subfield_B>3</subfield_B>
    <subfield_C>6000436355</subfield_C>
    <subfield_D>
      <subfield_DD>1000</subfield_DD>
    </subfield_D>
  </field_A>
  ...
  <field_A>
    <subfield_A>8100291240</subfield_A>
    <subfield_B>3</subfield_B>
    <subfield_C>6000436355</subfield_C>
    <subfield_D>
      <subfield_DD>1000</subfield_DD>
    </subfield_D>
  </field_A>
</root>

A field_A element with this format repeats thousands of times throughout the XML. My first approach was to use the xml filter to extract the elements, but I only need two subfields from each field_A.

At first, I used this:
add_field => {
  field_A => "%{[root][field_A][0][subfield_A]}"
}

I changed the 0 to a 1 and, sure enough, I was able to access the second element of the array. It worked like a charm, but the problem is that I need to use the same field every time, and I don't know beforehand how many field_A elements the XML will contain. I looked for some kind of loop in Logstash, with no luck.

So I decided to use the ruby filter. It took me a while, but I was able to navigate inside the nested fields. Again, though, I hit the same problem: I could only access one specific element. For that I used this:

code => "event['root'] = event['root']['field_A'][1]['subfield_A']"

So my question is: how can I store multiple values under a single key in Logstash, given that this functionality is part of a much larger configuration file?

In other words, ideally I need something like this:

field_A => {
  [subfield_A, subfield_B],
  [subfield_A, subfield_B],
  ...
  [subfield_A, subfield_B]
}

I'm already losing my mind; any help would be appreciated.

colings86 · December 16, 2015, 4:23pm

I'm going to change the category here to Logstash as you will get access to more people who know about Logstash that way.

magnusbaeck · December 17, 2015, 10:55am

So... you want to extract the contents of all subfield_A and subfield_B subelements from all field_A elements?

Turn

<root>
  <field_A>
    <subfield_A>1</subfield_A>
    <subfield_A>2</subfield_A>
  </field_A>
  <field_A>
    <subfield_A>3</subfield_A>
    <subfield_A>4</subfield_A>
  </field_A>
</root>

into this:

{
  "field_A": ["1", "2", "3", "4"],
  ...
}
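
For what it's worth, the flattening described above can be sketched in plain Ruby. This is only a rough illustration, not a tested Logstash configuration: it assumes the XML has already been parsed into nested hashes in which every repeated element becomes an array, and the helper name collect_values is hypothetical.

```ruby
# Hypothetical helper: gather every value stored under `key` across all
# repeated entries. Assumes each entry is a Hash whose values are either a
# single string or an array of strings.
def collect_values(entries, key)
  entries = [entries] if entries.is_a?(Hash) # a lone element may not be wrapped in an array
  Array(entries).flat_map { |entry| Array(entry[key]) }
end

# Assumed shape of the parsed XML from the example above.
parsed = {
  'field_A' => [
    { 'subfield_A' => ['1', '2'] },
    { 'subfield_A' => ['3', '4'] }
  ]
}

flat = collect_values(parsed['field_A'], 'subfield_A')
puts flat.inspect
```

Because the helper maps over whatever entries exist, the number of field_A elements does not need to be known in advance.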

Shayleen · December 17, 2015, 11:18am

Hello, Magnus

Thanks for your answer, but I already know how to extract a single occurrence of field_A. My problem is how to do that for every occurrence AND store that information in the same field (with the same name).

All I've managed so far is to overwrite the previous value or save only the first one.

Besides, I don't only need the values of the subfield_A and subfield_C tags; I also need to keep the tag names as fields within each entry, because I need to use those fields in my searches in Kibana.

TURN THIS:
<root>
  <shirt>
    <color>red</color>
    <size>5</size>
  </shirt>
  ...
  <shirt>
    <color>white</color>
    <size>6</size>
  </shirt>
</root>

INTO THIS:
{
  "shirt": [
    [ [ [color], "red" ], [ [size], 5 ] ],
    ...
    [ [ [color], "white" ], [ [size], 6 ] ]
  ]
}

Do you know how to do this? Thanks for your help.
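
One possible direction, sketched in plain Ruby under stated assumptions: the XML is assumed to be already parsed into nested hashes (each shirt a Hash, each tag value wrapped in a one-element array), and pick_fields is a hypothetical helper name. Instead of the nested pairs shown above, the sketch keeps each tag name as a hash key, which is an assumption about what would be convenient to search on, not a confirmed requirement.

```ruby
# Hypothetical helper: for every repeated element, keep only the wanted
# tags, preserving the tag names as keys.
def pick_fields(entries, wanted)
  entries = [entries] if entries.is_a?(Hash) # a lone element may not be wrapped in an array
  Array(entries).map do |entry|
    wanted.each_with_object({}) do |name, picked|
      value = entry[name]
      # Unwrap one-element arrays so "red" is stored instead of ["red"].
      value = value.first if value.is_a?(Array) && value.size == 1
      picked[name] = value unless value.nil?
    end
  end
end

# Assumed shape of the parsed XML from the shirt example.
parsed = {
  'shirt' => [
    { 'color' => ['red'],   'size' => ['5'] },
    { 'color' => ['white'], 'size' => ['6'] }
  ]
}

shirts = pick_fields(parsed['shirt'], %w[color size])
puts shirts.inspect
```

Inside a ruby filter the result could then be assigned to one field in a single statement, using the event['...'] syntax from the earlier post (newer Logstash releases use event.get and event.set instead). The values stay strings here; converting size to a number would be a separate step.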
