Mapping IDs in XML with Logstash


#1

Hi all,

I have this chat transcript in XML format, which I want to insert into Elasticsearch using Logstash.

<?xml version="1.0"?>
<chatTranscript>
	<newParty userId="123">
		<userInfo userNick="A" userType="CLIENT"/>		
	</newParty>
	<message userId="123">
		<msgText>This is just a text from A</msgText>
	</message>
	<newParty userId="456" >
		<userInfo userNick="B" userType="AGENT" />
	</newParty>

	<message userId="456" >
		<msgText>This is a text from B</msgText>
	</message>
		
	<newParty userId="789">
		<userInfo userNick="C" userType="AGENT"/>
	</newParty>
	<message userId="789">
		<msgText>This is a text from C</msgText>
	</message>
	
	<message userId="123" >
		<msgText>This is a reply from A</msgText>
	</message>
	
	<message userId="789">
		<msgText>This is another text from C</msgText>
	</message>
		
</chatTranscript>

My problem is that I don't want to put the userIds into Elasticsearch, but the real nicknames, which are defined not per message but only once, in the "newParty" XML element.

When I just parse this XML with the plain xml filter and store_xml enabled:

input {
	stdin {}
}

filter {
	xml {
		source => "message"
		target => "doc"
		store_xml => true
		force_array => false
		remove_namespaces => true
	}

	mutate {
		remove_field => [ "message", "host" ]
	}
}

output {
	stdout { codec => rubydebug }
}

The resulting event looks like this (rubydebug output):

{
    "@timestamp" => 2018-11-02T15:31:14.901Z,
      "@version" => "1",
           "doc" => {
         "message" => [
            [0] {
                 "userId" => "123",
                "msgText" => "This is just a text from A"
            },
            [1] {
                 "userId" => "456",
                "msgText" => "This is a text from B"
            },
            [2] {
                 "userId" => "789",
                "msgText" => "This is a text from C"
            },
            [3] {
                 "userId" => "123",
                "msgText" => "This is a reply from A"
            },
            [4] {
                 "userId" => "789",
                "msgText" => "This is another text from C"
            }
        ],
        "newParty" => [
            [0] {
                "userInfo" => {
                    "userNick" => "A",
                    "userType" => "CLIENT"
                },
                  "userId" => "123"
            },
            [1] {
                "userInfo" => {
                    "userNick" => "B",
                    "userType" => "AGENT"
                },
                  "userId" => "456"
            },
            [2] {
                "userInfo" => {
                    "userNick" => "C",
                    "userType" => "AGENT"
                },
                  "userId" => "789"
            }
        ]
    }
}

What I want instead is for the userIds 123, 456 and 789 in the messages to be replaced directly by the nicknames, which are given as userNick attributes in the newParty elements above.
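
To make the goal concrete, here is a rough sketch in plain Ruby of the transformation I am after. It is only an illustration, not a tested Logstash solution: the helper names replace_user_ids_with_nicks and wrap are made up, the input is assumed to have exactly the shape of the "doc" field shown above, and writing the nick under a "userNick" key (instead of overwriting "userId") is just one possible choice.

```ruby
# Hypothetical helper: with force_array => false a single XML element is
# stored as a plain hash instead of a one-element array, so normalise both.
def wrap(value)
  return [] if value.nil?
  value.is_a?(Array) ? value : [value]
end

# Hypothetical helper: returns a new list of messages in which each userId
# has been replaced by the userNick declared in the matching newParty entry.
# `doc` is assumed to look like the "doc" field produced by the xml filter.
def replace_user_ids_with_nicks(doc)
  # Build a lookup table userId => userNick from the newParty entries.
  nick_by_id = {}
  wrap(doc["newParty"]).each do |party|
    nick_by_id[party["userId"]] = party.dig("userInfo", "userNick")
  end

  # Rewrite every message; fall back to the raw userId if no nick is known.
  wrap(doc["message"]).map do |msg|
    {
      "userNick" => nick_by_id[msg["userId"]] || msg["userId"],
      "msgText"  => msg["msgText"]
    }
  end
end
```

Presumably something like this could run inside a ruby filter placed after the xml filter, reading the parsed structure with event.get("doc") and writing the rewritten messages back with event.set, but I have not tried that.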

Any idea how to solve this?

Regards,
Christian

