Logstash Ruby filter not saving when fields change

I am currently trying to parse some XML with the XML plugin and then extract the attributes with a Ruby script. When it works, the XML I am parsing looks like this:

<proto name="safety">
  <field name="safety.ETY" value="0"/>
  <field name="safety.MTI" value="9"/>
  <field name="safety.direction"  value="1">
  <field name="safety.MAC"  value="aabbccddee11" />
</proto>

(I have obscured the actual MAC address, replacing it with "aabbccddee11".)
This is then parsed with this filter:

xml {
    source => "safety"
    xpath => ["/proto/field", "safety_fields"]
    target => "safety_fields"
}

# Ruby script for parsing the XML
ruby {
    code => '
        fields = event.get("[safety_fields][field]")
        if fields.nil?
            event.tag("_no_fields")
            return
        end

        ss37 = {}
        fields.each do |field|
            begin
                field_name = field["name"].split(".")[1] rescue nil
                field_value = field["value"] rescue nil

                if field_name && field_value
                    ss37[field_name] = field_value
                end
            rescue => e
                event.tag("_ruby_exception")
            end
        end

        logger.info("Final ss37 hash: #{ss37}")

        event.set("ss37", ss37)

        final_field = event.get("ss37")
        logger.info("ss37 from the document AFTER insertion: #{final_field}")
    '
}

mutate {
    remove_field => ["safety_fields", "safety"]
}
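For reference, the core of the filter's logic can be sketched as plain Ruby outside Logstash (a sketch only: `build_ss37` is a hypothetical helper name, and the sample hash mirrors the shape the xml filter produces with store_xml => true):

```ruby
# Standalone sketch of the ruby-filter logic, runnable outside Logstash.
# Assumes the nested structure produced by store_xml => true: a hash whose
# "field" key holds an array of {"name" => ..., "value" => ...} hashes.
def build_ss37(safety_fields)
  fields = safety_fields["field"]
  return nil if fields.nil?

  ss37 = {}
  fields.each do |field|
    name = field["name"].to_s.split(".")[1]
    value = field["value"]
    # Only keep attributes that have both a name and a value
    ss37[name] = value if name && value
  end
  ss37
end

# Sample input mirroring the five-key XML document below
parsed = {
  "name" => "safety",
  "field" => [
    { "name" => "safety.ETY", "value" => "0" },
    { "name" => "safety.MTI", "value" => "8" },
    { "name" => "safety.direction", "value" => "0" },
    { "name" => "safety.disconnect_reason", "value" => "00" },
    { "name" => "safety.disconnect_sub_reason", "value" => "0000" }
  ]
}

puts build_ss37(parsed).inspect
```

Running this shows the logic itself is indifferent to how many keys appear, which is worth keeping in mind when reading the rest of the thread.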

When the XML looks like that, it works. I get the following output on stdout:

"ss37" => {
              "MTI" => "9",
              "ETY" => "0",
              "MAC" => "aabbccddee11",
        "direction" => "1"
    }

Which is the desired output.

However, when the XML has a different set of keys (handling that variation is the whole reason I am using a Ruby script), it no longer works. For example:

<proto name="safety">
  <field name="safety.ETY" value="0" />
  <field name="safety.MTI" value="8"/>
  <field name="safety.direction" value="0"/>
  <field name="safety.disconnect_reason" value="00"/>
  <field name="safety.disconnect_sub_reason" value="0000"/>
</proto>

The Ruby script formats it correctly, as the log output shows:

[2024-06-27T13:58:48,591][INFO ][logstash.filters.ruby    ][main][63cf1b77a8e5179f187d2784e221361e954226dfd0cb02edb476ee037220b788] Final ss37 hash: {"ETY"=>"0", "MTI"=>"8", "direction"=>"0", "disconnect_reason"=>"00", "disconnect_sub_reason"=>"0000"}
[2024-06-27T13:58:48,591][INFO ][logstash.filters.ruby    ][main][63cf1b77a8e5179f187d2784e221361e954226dfd0cb02edb476ee037220b788] ss37 from the document AFTER insertion: {"MTI"=>"8", "direction"=>"0", "ETY"=>"0", "disconnect_sub_reason"=>"0000", "disconnect_reason"=>"00"}

But now the "ss37" field is not present at all.

I am slightly unsure whether the event as a whole is skipped when this happens, as I am not sure how to debug it properly. However, when I search for timestamps that are processed from the rest of the XML by another part of the config, I cannot find a cross-reference for this timestamp, which implies the event is skipped entirely.

This is really puzzling to me, as the logging output shows the Ruby script working fine. Are there rules in Logstash that prevent an object from having different keys than a previous document had? I appreciate any answers.

Your xml filter sets the xpath option and also relies on the default value of store_xml (true), so it first processes the xpath option, setting safety_fields to

"safety_fields" => [
    [0] "<field name=\"safety.ETY\" value=\"0\"/>",
    [1] "<field name=\"safety.MTI\" value=\"9\"/>",
    [2] "<field name=\"safety.direction\" value=\"1\">\n  <field name=\"safety.MAC\" value=\"aabbccddee11\"/>\n</field>"
],

and then, if the XML is valid, it will (assuming the syntax error in the source XML is fixed) overwrite [safety_fields] with

"safety_fields" => {
    "field" => [
        [0] {
             "name" => "safety.ETY",
            "value" => "0"
        },
        [1] {
             "name" => "safety.MTI",
            "value" => "9"
        },
        [2] {
             "name" => "safety.direction",
            "value" => "1"
        },
        [3] {
             "name" => "safety.MAC",
            "value" => "aabbccddee11"
        }
    ],
     "name" => "safety"
},

However, if the syntax error is not fixed, then just the xpath result will be left in the event, in which case the ruby filter will tag the event with _no_fields and the [ss37] field will not be created.
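As an aside, since the ruby filter only reads the store_xml result, one possible simplification (a sketch, keeping the field names from the original config) is to drop the redundant xpath option so the two results cannot shadow each other:

```
xml {
    source => "safety"
    target => "safety_fields"
    # store_xml defaults to true, so [safety_fields] becomes the nested
    # hash (with a [field] array) that the ruby filter expects
}
```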

I suggest commenting out

    mutate { remove_field => ["safety_fields", "safety"] }

until you are sure what those fields look like.
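While debugging, a rubydebug stdout output makes it easy to see exactly what each event contains (a sketch; add it alongside any existing outputs):

```
output {
    stdout { codec => rubydebug }
}
```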

Thanks for your reply.

I've commented out the removal of the "safety" and "safety_fields" fields. I also temporarily removed the "_no_fields" tagging on an empty field.

However, I still cannot find the fields that are supposed to be there. I tried searching for "disconnect_reason" in the output, but it is not in any "ss37", "safety", or "safety_fields". It's just not present at all in any event, even though it is there in the XML file.

The XML file is valid. Other parts of these files are already being parsed by another Logstash configuration.

Edit:

This is the XPath I am using to get the relevant XML out of the document:

xpath => [
    ...,
    "//proto[@name='safety']", "safety"
]

Sorry, didn't mark the above as a reply.