Urgent help: how to parse nested XML (key/value patterns) with multiple parent and child levels

Hello Community,
I have minimal experience with XML parsing in Logstash.
I have explored various discussions. The one that comes closest to my case is "[SOLVED] Split filter question a.k.a flatten json sub array" (post #10 by blastik), which I tried, but it didn't work for me because my data has multiple levels of hierarchy.
I have a problem parsing voice console data, which looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<terminalCommand serialNumber="6217015078" commandID="1261">
  <timestamp>2022-07-21 09:43:48 GMT-04:00</timestamp>
  <terminalState id="3">SLEEPING</terminalState>
  <command>
    <sendTerminalProperties commandVersion="2">
      <property name="multipleResidentTtsEnabled" value="true" />
      <group name="Bluetooth">
        <property name="discoverable" value="false" />
        <property name="scanner_supported" value="true" />
        <group name="PairHeadset">
          <property name="address" value="00:14:28:06:A3:66" />
          <property name="status" value="Searching" />
          <property name="type" value="Headset" />
        </group>
      </group>
      <group name="Network" />
      <group name="Battery">
        <property name="BattFirstUseDateUtc" value="1654228800" />
        <property name="BatteryHealthCode" value="0" />
        <property name="ChargeCapacity" value="8.902" />
        <property name="PartNumber" value="BT-901" />
        <property name="SerialNumber" value="351639132105" />
      </group>
    </sendTerminalProperties>
  </command>
</terminalCommand>

My Logstash config is:

input {
  beats {
    port => 50xx
    client_inactivity_timeout => 1200
  }
}
filter {
  grok {
    match => { "message" => "^%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:loglevel}\s+(\[)%{DATA:logger}(\])\s+%{DATA:command} (\|)%{DATA:output}?\s*$(?<stacktrace>(?m:.*))" }
  }
  if "<?xml" in [stacktrace] {
    xml {
      source => "stacktrace"
      target => "xmldata"
    }
  }
}

But because of the nature of the data, I get multiple values for a single field per event, for example:
xmldata.command.sendTerminalProperties.group.name : Bluetooth, Network, Battery
xmldata.command.sendTerminalProperties.group.property.name : discoverable, discoverable_modifiable, display_supported, enabled, enabled_modifiable, headset_manual_pairing_enabled, headset_manual_pairing_modifiable, headset_supported, printer_supported, scanner_supported, BattFirstUseDateUtc, BatteryHealthCode, ChargeCapacity, ChargeCycles, DesignCapacity, PartNumber, RunTime, SerialNumber
xmldata.command.sendTerminalProperties.group.property.value : false, false, true, true, true, true, true, true, true, 1645660800, 0, 9.719, 35, 10.73, 730051, 0, 3521401

My expected output would look like this:
xmldata.command.sendTerminalProperties.group.name : Battery
xmldata.command.sendTerminalProperties.group.property.name : BatteryHealthCode
xmldata.command.sendTerminalProperties.group.property.value : 2
xmldata.command.sendTerminalProperties.group.name : Battery
xmldata.command.sendTerminalProperties.group.property.name : ChargeCapacity
xmldata.command.sendTerminalProperties.group.property.value : 2

so that I can use particular values in a dashboard.
I would appreciate quick advice.
Please let me know if you need any additional information from me.
Thanks

The only way to have two objects with the same name is to make the object an array. I would significantly rearrange the data using ruby to expand all the name/value pairs.

    xml { source => "message" force_array => false target => "xmldata" }
    ruby {
        init => '
            # Walk the parsed XML recursively. Every "property" node is replaced by
            # a hash of name => value pairs; everything else is descended into.
            def doSomething(object, name, event)
                if object
                    if object.kind_of?(Hash) and object != {}
                        if name =~ /\[property\]$/
                            # A single property element: { "name" => ..., "value" => ... }
                            event.set(name, { object["name"] => object["value"] })
                        else
                            object.each { |k, v| doSomething(v, "#{name}[#{k}]", event) }
                        end
                    elsif object.kind_of?(Array) and object != []
                        if name =~ /\[property\]$/
                            # Several property elements: merge them into one hash
                            h = {}
                            object.each_index { |i|
                                h[object[i]["name"]] = object[i]["value"]
                            }
                            event.set(name, h)
                        else
                            object.each_index { |i|
                                doSomething(object[i], "#{name}[#{i}]", event)
                            }
                        end
                    end
                end
            end
        '
        code => '
            xml = event.get("xmldata")
            if xml
                doSomething(xml, "[xmldata]", event)
            end
        '
    }

This will get you

        "sendTerminalProperties" => {
                  "property" => {
                "multipleResidentTtsEnabled" => "true"
            },
            "commandVersion" => "2",
                     "group" => [
                [0] {
                        "name" => "Bluetooth",
                    "property" => {
                             "discoverable" => "false",
                        "scanner_supported" => "true"
                    },
                       "group" => {
                        "property" => {
                             "status" => "Searching",
                               "type" => "Headset",
                            "address" => "00:14:28:06:A3:66"
                        },
                            "name" => "PairHeadset"
                    }
                },
                [1] {
                    "name" => "Network"
                },

etc. I would consider further reorganizing the outer group object but that is a lot more code to write.
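
One way the outer group object could be reorganized is to key each group by its name, so that a value is reachable by group name and property name rather than through an array index. The following is only a sketch in plain Ruby, run outside Logstash on a hash shaped like the force_array => false output. The helper names `as_list` and `index_groups` and the sample hash are hypothetical, and it assumes that group and property names are unique among siblings.

```ruby
# Hypothetical helpers: turn group/property lists into hashes keyed by name.
# Assumes the shape produced with force_array => false, where a single child
# is a Hash and repeated children are an Array.
def as_list(node)
  return [] if node.nil?
  node.is_a?(Array) ? node : [node]
end

def index_groups(node)
  result = {}
  # Each property element becomes name => value.
  as_list(node["property"]).each { |p| result[p["name"]] = p["value"] }
  # Each nested group becomes name => { its own indexed contents }.
  as_list(node["group"]).each { |g| result[g["name"]] = index_groups(g) }
  # Other attributes (such as commandVersion) are not copied in this sketch.
  result
end

# Sample data, abbreviated from the XML in the question.
props = {
  "commandVersion" => "2",
  "property" => { "name" => "multipleResidentTtsEnabled", "value" => "true" },
  "group" => [
    { "name" => "Bluetooth",
      "property" => [
        { "name" => "discoverable", "value" => "false" },
        { "name" => "scanner_supported", "value" => "true" }
      ],
      "group" => { "name" => "PairHeadset",
                   "property" => [
                     { "name" => "status", "value" => "Searching" },
                     { "name" => "type", "value" => "Headset" }
                   ] } },
    { "name" => "Network" },
    { "name" => "Battery",
      "property" => [
        { "name" => "BatteryHealthCode", "value" => "0" },
        { "name" => "ChargeCapacity", "value" => "8.902" }
      ] }
  ]
}

indexed = index_groups(props)
puts indexed["Battery"]["ChargeCapacity"]
puts indexed["Bluetooth"]["PairHeadset"]["status"]
```

Inside Logstash, the two helpers could presumably be defined in the ruby filter's init option, with the code option reading the sendTerminalProperties object through event.get and writing the result to a field of your choosing through event.set.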

Thanks for your quick response, Badger. In the interest of time, what if I need just specific fields with their corresponding values instead of flattening the whole hierarchy? Specifically:
BatteryHealthCode
ChargeCapacity
PartNumber
SerialNumber
from the parent xmldata.command.sendTerminalProperties.group.name : Battery.
Is there any other way, such as split, that you can advise?
I tried the split you suggested, as below (my code might not be correct):

filter {
  grok {
    match => { "message" => "^%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:loglevel}\s+(\[)%{DATA:logger}(\])\s+%{DATA:command} (\|)%{DATA:output}?\s*$(?<stacktrace>(?m:.*))" }
  }
  if "<?xml" in [stacktrace] {
    xml {
      source => "stacktrace"
      target => "xmldata"
    }
  }
  split { field => "[xmldata][command][sendTerminalProperties][group][name]" }
  split { field => "[xmldata][command][sendTerminalProperties][property][name]" }
  split { field => "[xmldata][command][sendTerminalProperties][property][value]" }
}

Please advise.

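If the structure of the XML is fixed and you only need a few elements, then you could pick them out by position (this assumes the xml filter uses force_array => false, as in my earlier example, and that the ruby filter is not applied):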

    mutate {
        add_field => {
            "BatteryHealthCode" => "%{[xmldata][command][sendTerminalProperties][group][2][property][1][value]}"
            "ChargeCapacity" => "%{[xmldata][command][sendTerminalProperties][group][2][property][2][value]}"
        }
    }
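
Positional references like these depend on the groups and properties always arriving in the same order. If that is not guaranteed, the values could instead be looked up by name. The following plain-Ruby sketch shows the idea outside Logstash. The helper names `as_list` and `find_property` and the sample hash are hypothetical, and it assumes the force_array => false shape, where a single child is a Hash and repeated children are an Array.

```ruby
# Hypothetical helpers: look up a property value by group name and property
# name instead of by array position.
def as_list(node)
  return [] if node.nil?
  node.is_a?(Array) ? node : [node]
end

def find_property(props, group_name, property_name)
  group = as_list(props["group"]).find { |g| g["name"] == group_name }
  return nil if group.nil?
  prop = as_list(group["property"]).find { |p| p["name"] == property_name }
  prop && prop["value"]
end

# Sample data, abbreviated from the XML in the question.
props = {
  "group" => [
    { "name" => "Network" },
    { "name" => "Battery",
      "property" => [
        { "name" => "BattFirstUseDateUtc", "value" => "1654228800" },
        { "name" => "BatteryHealthCode", "value" => "0" },
        { "name" => "ChargeCapacity", "value" => "8.902" },
        { "name" => "PartNumber", "value" => "BT-901" },
        { "name" => "SerialNumber", "value" => "351639132105" }
      ] }
  ]
}

wanted = %w[BatteryHealthCode ChargeCapacity PartNumber SerialNumber]
wanted.each do |name|
  puts "#{name} = #{find_property(props, "Battery", name)}"
end
```

In a ruby filter, the same lookup could presumably run against event.get("[xmldata][command][sendTerminalProperties]"), with each result stored through event.set. A missing group or property yields nil, so the field can simply be skipped.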

We used a similar concept and were able to parse the data. Thanks a lot, Badger, you have been a great help! Keep rocking! :)
