XML mapping of IDs with Logstash

Hi all,

I want to index chats into Elasticsearch, but I have a problem converting the chat XML transcript into the proper JSON with Logstash.

This is the XML:

<?xml version="1.0" encoding="UTF-8"?>
<chatTranscript>
	<newParty userId="123">
		<userInfo userNick="A" userType="CLIENT" />
	</newParty>
	<message userId="123">
		<msgText>This is just a text from A</msgText>
	</message>
	<newParty userId="456">
		<userInfo userNick="B" userType="AGENT" />
	</newParty>
	<message userId="456">
		<msgText>This is a text from B</msgText>
	</message>
	<newParty userId="789">
		<userInfo userNick="C" userType="AGENT" />
	</newParty>
	<message userId="789">
		<msgText>This is a text from C</msgText>
	</message>
	<message userId="123">
		<msgText>This is a reply from A</msgText>
	</message>
	<message userId="789">
		<msgText>This is another text from C</msgText>
	</message>
</chatTranscript>

This is my basic starting Logstash config to test with:
input {
	stdin { }
}

filter {
	xml {
		source => "message"
		target => "doc"
		store_xml => true
		force_array => false
		remove_namespaces => true
	}

	mutate {
		remove_field => [ "message", "host" ]
	}
}

output {
	stdout { codec => rubydebug }
}

This is the resulting event, as printed by the rubydebug codec:

{
    "@timestamp" => 2018-11-02T15:31:14.901Z,
      "@version" => "1",
           "doc" => {
         "message" => [
            [0] {
                 "userId" => "123",
                "msgText" => "This is just a text from A"
            },
            [1] {
                 "userId" => "456",
                "msgText" => "This is a text from B"
            },
            [2] {
                 "userId" => "789",
                "msgText" => "This is a text from C"
            },
            [3] {
                 "userId" => "123",
                "msgText" => "This is a reply from A"
            },
            [4] {
                 "userId" => "789",
                "msgText" => "This is another text from C"
            }
        ],
        "newParty" => [
            [0] {
                "userInfo" => {
                    "userNick" => "A",
                    "userType" => "CLIENT"
                },
                  "userId" => "123"
            },
            [1] {
                "userInfo" => {
                    "userNick" => "B",
                    "userType" => "AGENT"
                },
                  "userId" => "456"
            },
            [2] {
                "userInfo" => {
                    "userNick" => "C",
                    "userType" => "AGENT"
                },
                  "userId" => "789"
            }
        ]
    }
}

What I want to achieve is that each entry in the message array stores the userNick instead of the userId when it is indexed into Elasticsearch.

How could I match those fields together?

Thanks in advance.

Regards,
Christian

Hi all,

Any ideas how to achieve this?

Regards,
Christian
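
One possible approach, as a minimal sketch: build a lookup table from the newParty entries (userId to userNick) and then rewrite each message with it. The plain Ruby below works on a hash shaped like the "doc" field in the output above. The helper names as_list and replace_user_ids are made up for this illustration, and the sketch assumes that a userId without a matching newParty entry should simply be left untouched.

```ruby
# Hypothetical helper: force_array => false yields a single Hash instead of
# an Array when an element occurs only once, so normalise both cases.
def as_list(value)
  return [] if value.nil?
  value.is_a?(Array) ? value : [value]
end

# Hypothetical helper: returns a copy of doc in which every message carries
# userNick instead of userId. Messages with an unknown userId are kept as-is.
def replace_user_ids(doc)
  # Build the userId => userNick lookup table from the newParty entries.
  nick_by_id = {}
  as_list(doc["newParty"]).each do |party|
    info = party["userInfo"] || {}
    nick_by_id[party["userId"]] = info["userNick"]
  end

  # Rewrite each message using the lookup table.
  messages = as_list(doc["message"]).map do |msg|
    nick = nick_by_id[msg["userId"]]
    next msg.dup if nick.nil?
    { "userNick" => nick, "msgText" => msg["msgText"] }
  end

  doc.merge("message" => messages)
end

# Sample data: a shortened version of the parsed transcript from the post.
doc = {
  "message" => [
    { "userId" => "123", "msgText" => "This is just a text from A" },
    { "userId" => "456", "msgText" => "This is a text from B" },
    { "userId" => "123", "msgText" => "This is a reply from A" }
  ],
  "newParty" => [
    { "userInfo" => { "userNick" => "A", "userType" => "CLIENT" }, "userId" => "123" },
    { "userInfo" => { "userNick" => "B", "userType" => "AGENT" }, "userId" => "456" }
  ]
}

result = replace_user_ids(doc)
result["message"].each { |m| puts "#{m['userNick']}: #{m['msgText']}" }
```

Inside Logstash, the same logic could probably run in a ruby filter placed after the xml filter, reading the parsed structure with event.get("doc") and writing it back with event.set("doc", ...). I have not tested this against a real pipeline, so treat it as a starting point rather than a finished config.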
