Logstash XML parsing issues - trying to send to Graylog, part 2

In this thread, I was able to get an initial config working, but I ran into an issue parsing the data: different event types produce different data fields. Is there a way to handle these fields dynamically? If not, I assume it has to be done with if/else conditions in the filter, but I couldn't get that to work. Any input is appreciated.
Here is my current config:

input {
  file {
    path => "/tmp/cifs_audit/*.xml"
    start_position => "beginning"
    type => "cifs_audit"
  }
}

filter {
  xml {
    source => "message"
    target => "xml_content"
    remove_namespaces => true
    store_xml => true
    force_array => false
    xpath => [ "/Event/System/EventID/text()", "System.EventID" ]
    xpath => [ "/Event/System/EventName/text()", "System.EventName" ]
    xpath => [ "/Event/System/Result/text()", "System.Result" ]
    xpath => [ "/Event/System/Channel/text()", "System.Channel" ]
    xpath => [ "/Event/System/Computer/text()", "System.Computer" ]
  }
  if [System.EventID] == "4663" and [System.EventName] == "Get Object Attributes" {
    xml {
      source => "message"
      target => "xml_eventcontent"
      remove_namespaces => true
      store_xml => true
      force_array => false
      xpath => [ "/Event/EventData//Data[1]@Name", "EventData.SubjectIP" ]
      xpath => [ "/Event/EventData//Data[3]@Name", "EventData.SubjectUserSid" ]
      xpath => [ "/Event/EventData//Data[4]@Name", "EventData.SubjectUserIsLocal" ]
      xpath => [ "/Event/EventData//Data[5]@Name", "EventData.SubjectDomainName" ]
      xpath => [ "/Event/EventData//Data[6]@Name", "EventData.SubjectUserName" ]
      xpath => [ "/Event/EventData//Data[7]@Name", "EventData.ObjectServer" ]
      xpath => [ "/Event/EventData//Data[8]@Name", "EventData.ObjectType" ]
      xpath => [ "/Event/EventData//Data[9]@Name", "EventData.HandleID" ]
      xpath => [ "/Event/EventData//Data[10]@Name", "EventData.ObjectName" ]
      xpath => [ "/Event/EventData//Data[11]@Name", "EventData.InformationRequested" ]
    }
  }
  else if [System.EventID] == "4656" and [System.EventName] == "Open Object" {
    xml {
      source => "message"
      target => "xml_eventcontent"
      remove_namespaces => true
      store_xml => true
      force_array => false
      xpath => [ "/Event/EventData//Data[1]@Name", "EventData.SubjectIP" ]
      xpath => [ "/Event/EventData//Data[3]@Name", "EventData.SubjectUserSid" ]
      xpath => [ "/Event/EventData//Data[4]@Name", "EventData.SubjectUserIsLocal" ]
      xpath => [ "/Event/EventData//Data[5]@Name", "EventData.SubjectDomainName" ]
      xpath => [ "/Event/EventData//Data[6]@Name", "EventData.SubjectUserName" ]
      xpath => [ "/Event/EventData//Data[7]@Name", "EventData.ObjectServer" ]
      xpath => [ "/Event/EventData//Data[8]@Name", "EventData.ObjectType" ]
      xpath => [ "/Event/EventData//Data[9]@Name", "EventData.HandleID" ]
      xpath => [ "/Event/EventData//Data[10]@Name", "EventData.ObjectName" ]
      xpath => [ "/Event/EventData//Data[11]@Name", "EventData.AccessList" ]
      xpath => [ "/Event/EventData//Data[12]@Name", "EventData.AccessMask" ]
      xpath => [ "/Event/EventData//Data[13]@Name", "EventData.DesiredAccess" ]
      xpath => [ "/Event/EventData//Data[14]@Name", "EventData.Attributes" ]
    }
  }
}

output {
  if [type] == "cifs_audit" {
    gelf {
      host => "graylog.host"
      port => 12201
    }
  }
}

Yes, I know the xpath entry for "/Event/EventData//Data[2]@Name" is missing; that is on purpose.

It seems wasteful to parse the XML twice, and I certainly would not store it twice. If you look at the parsed event data, you basically have an array of objects that contain [Name] and [content] fields.

    "EventData" => {
        "Data" => [
            [ 0] {
                "IPVersion" => "4",
                     "Name" => "SubjectIP",
                  "content" => "********"
            },
            [ 1] {
                "Local" => "false",
                 "Name" => "SubjectUnix",
                  "Uid" => "65534",
                  "Gid" => "65534"
            },
            [ 2] {
                   "Name" => "SubjectUserSid",
                "content" => "********"
            },
            [ 3] {
                   "Name" => "SubjectUserIsLocal",
                "content" => "false"
            },

You could use a ruby filter similar to this to flatten that into a hash.

@Badger so, swap out the if/else statements for the following?

ruby {
    code => '
      event.get("[xml_content]").each { |a|
        name = a["Name"]
        value = a["Content"]
        event.set( "[xml_content]#{name}", value)
      }
    '
}

I tried it, and got the following:

...
[ERROR] 2019-07-31 13:09:12.013 [[main]>worker1] ruby - Ruby exception occurred: no implicit conversion of String into Integer
[ERROR] 2019-07-31 13:09:12.013 [[main]>worker1] ruby - Ruby exception occurred: no implicit conversion of String into Integer
[ERROR] 2019-07-31 13:09:12.013 [[main]>worker1] ruby - Ruby exception occurred: no implicit conversion of String into Integer
[ERROR] 2019-07-31 13:09:12.014 [[main]>worker1] ruby - Ruby exception occurred: no implicit conversion of String into Integer
...

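Not quite. [xml_content] is the whole parsed document, so you need to iterate over the array at [xml_content][EventData][Data] instead, and the key is "content", not "Content".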

ruby {
    code => '
      event.get("[xml_content][EventData][Data]").each { |a|
        name = a["Name"]
        value = a["content"]
        #event.set( "[someField][#{name}]", value)
        event.set( "[eventData.#{name}]", value)
      }
    '
}

You can either nest the fields inside [someField] (the commented-out line) or prefix the names with eventData., as your original config did.
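
To see why the first attempt failed and what the two naming options produce, the logic can be tried in plain Ruby outside Logstash. This is only a rough sketch: the xml_content hash is a hypothetical stand-in shaped like the parsed output shown earlier, the values are made-up placeholders, and plain hashes take the place of event.get and event.set. It also skips entries that have no "content" key (such as SubjectUnix), which is an assumption of this sketch and not something the filter above does.

```ruby
# Hypothetical stand-in for the parsed event; all values are placeholders.
xml_content = {
  "EventData" => {
    "Data" => [
      { "IPVersion" => "4", "Name" => "SubjectIP", "content" => "192.0.2.10" },
      { "Local" => "false", "Name" => "SubjectUnix", "Uid" => "65534", "Gid" => "65534" },
      { "Name" => "SubjectUserSid", "content" => "S-1-5-21-0" },
      { "Name" => "SubjectUserIsLocal", "content" => "false" }
    ]
  }
}

# First attempt: iterating the top-level hash with one block parameter
# yields [key, value] arrays, so a["Name"] indexes an Array with a String.
error_message = begin
  xml_content.each { |a| a["Name"] }
  nil
rescue TypeError => e
  e.message
end

# Working version: iterate the Data array, where each entry is a hash.
prefixed = {}                   # stands in for event.set("[eventData.#{name}]", value)
nested = { "someField" => {} }  # stands in for event.set("[someField][#{name}]", value)

xml_content["EventData"]["Data"].each do |a|
  name = a["Name"]
  value = a["content"]
  next if value.nil?            # assumption: skip entries without "content"
  prefixed["eventData.#{name}"] = value
  nested["someField"][name] = value
end

puts error_message
puts prefixed.inspect
puts nested.inspect
```

The first puts shows the same exception text that appeared in the Logstash log; the other two show the prefixed and nested layouts side by side.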


That was it, thanks!!!!!