Mapping IDs in XML with Logstash

Hi all,

I have this chat transcript in XML format, which I want to insert into Elasticsearch using Logstash.

<?xml version="1.0"?>
<chatTranscript>
	<newParty userId="123">
		<userInfo userNick="A" userType="CLIENT"/>		
	</newParty>
	<message userId="123">
		<msgText>This is just a text from A</msgText>
	</message>
	<newParty userId="456" >
		<userInfo userNick="B" userType="AGENT" />
	</newParty>

	<message userId="456" >
		<msgText>This is a text from B</msgText>
	</message>
		
	<newParty userId="789">
		<userInfo userNick="C" userType="AGENT"/>
	</newParty>
	<message userId="789">
		<msgText>This is a text from C</msgText>
	</message>
	
	<message userId="123" >
		<msgText>This is a reply from A</msgText>
	</message>
	
	<message userId="789">
		<msgText>This is another text from C</msgText>
	</message>
		
</chatTranscript>

My problem is that I don't want to put the userIds into Elasticsearch, but the real nicknames. These are not defined per message, but only once per user, in the "newParty" XML element.

When I parse this XML with the plain xml filter and store_xml enabled:

input {
	stdin { }
}

filter {
	xml {
		source => "message"
		target => "doc"
		store_xml => true
		force_array => false
		remove_namespaces => true
	}

	mutate {
		remove_field => [ "message", "host" ]
	}
}

output {
	stdout { codec => rubydebug }
}

The resulting event (rubydebug output) looks like this:

{
    "@timestamp" => 2018-11-02T15:31:14.901Z,
      "@version" => "1",
           "doc" => {
         "message" => [
            [0] {
                 "userId" => "123",
                "msgText" => "This is just a text from A"
            },
            [1] {
                 "userId" => "456",
                "msgText" => "This is a text from B"
            },
            [2] {
                 "userId" => "789",
                "msgText" => "This is a text from C"
            },
            [3] {
                 "userId" => "123",
                "msgText" => "This is a reply from A"
            },
            [4] {
                 "userId" => "789",
                "msgText" => "This is another text from C"
            }
        ],
        "newParty" => [
            [0] {
                "userInfo" => {
                    "userNick" => "A",
                    "userType" => "CLIENT"
                },
                  "userId" => "123"
            },
            [1] {
                "userInfo" => {
                    "userNick" => "B",
                    "userType" => "AGENT"
                },
                  "userId" => "456"
            },
            [2] {
                "userInfo" => {
                    "userNick" => "C",
                    "userType" => "AGENT"
                },
                  "userId" => "789"
            }
        ]
    }
}

But I want the userIds 123, 456 and 789 in the messages to be replaced directly by the nicknames, i.e. the userNick attributes from the newParty elements above.

Any idea how to solve this?

Regards,
Christian
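
To make the desired mapping concrete, here is a minimal sketch in plain Ruby of one possible approach: build a lookup table from the newParty entries, then rewrite each message. The helper name replace_user_ids_with_nicks is hypothetical, and the sketch assumes the parsed structure looks like the "doc" field shown above. It is an illustration under those assumptions, not a tested Logstash pipeline.

```ruby
# Hypothetical helper (not part of Logstash): replaces each message's userId
# with the userNick declared in the matching newParty entry.
def replace_user_ids_with_nicks(doc)
  # With force_array => false, a single element arrives as a Hash instead of
  # an Array, so normalise both cases (and nil) to an Array.
  wrap = lambda do |value|
    next [] if value.nil?
    value.is_a?(Array) ? value : [value]
  end

  # Build the lookup table userId -> userNick from the newParty entries.
  nick_by_id = {}
  wrap.call(doc["newParty"]).each do |party|
    info = party["userInfo"]
    next unless info.is_a?(Hash) && info["userNick"]
    nick_by_id[party["userId"]] = info["userNick"]
  end

  # Rewrite each message in place; unknown ids are left unchanged.
  wrap.call(doc["message"]).each do |msg|
    id = msg["userId"]
    msg["userId"] = nick_by_id.fetch(id, id)
  end

  doc
end

# Small demo using a subset of the data from the post.
doc = {
  "newParty" => [
    { "userId" => "123", "userInfo" => { "userNick" => "A", "userType" => "CLIENT" } },
    { "userId" => "456", "userInfo" => { "userNick" => "B", "userType" => "AGENT" } }
  ],
  "message" => [
    { "userId" => "123", "msgText" => "This is just a text from A" },
    { "userId" => "456", "msgText" => "This is a text from B" }
  ]
}

replace_user_ids_with_nicks(doc)["message"].each do |m|
  puts "#{m['userId']}: #{m['msgText']}"
end
```

Inside Logstash, the same logic could presumably go into the code option of a ruby filter placed after the xml filter, reading the parsed structure with event.get("doc") and writing the result back with event.set("doc", ...). Whether the value should overwrite userId or go into a separate field such as userNick is a design choice; a separate field keeps the original id available for later lookups.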
