XML mapping of IDs with Logstash


#1

Hi all,

I want to index chat transcripts into Elasticsearch, but I am having trouble converting the chat XML transcript into the right JSON structure with Logstash.

This is the XML:

<?xml version="1.0" encoding="UTF-8"?>
<chatTranscript>
	<newParty userId="123">
		<userInfo userNick="A" userType="CLIENT" />
	</newParty>
	<message userId="123">
		<msgText>This is just a text from A</msgText>
	</message>
	<newParty userId="456">
		<userInfo userNick="B" userType="AGENT" />
	</newParty>
	<message userId="456">
		<msgText>This is a text from B</msgText>
	</message>
	<newParty userId="789">
		<userInfo userNick="C" userType="AGENT" />
	</newParty>
	<message userId="789">
		<msgText>This is a text from C</msgText>
	</message>
	<message userId="123">
		<msgText>This is a reply from A</msgText>
	</message>
	<message userId="789">
		<msgText>This is another text from C</msgText>
	</message>
</chatTranscript>

This is my basic starting Logstash config to test with:
input {
	stdin { }
}

filter {
	xml {
		source => "message"
		target => "doc"
		store_xml => true
		force_array => false
		remove_namespaces => true
	}

	mutate {
		remove_field => [ "message", "host" ]
	}
}

output {
	stdout { codec => rubydebug }
}

This is the resulting event, printed with the rubydebug codec:

{
    "@timestamp" => 2018-11-02T15:31:14.901Z,
      "@version" => "1",
           "doc" => {
         "message" => [
            [0] {
                 "userId" => "123",
                "msgText" => "This is just a text from A"
            },
            [1] {
                 "userId" => "456",
                "msgText" => "This is a text from B"
            },
            [2] {
                 "userId" => "789",
                "msgText" => "This is a text from C"
            },
            [3] {
                 "userId" => "123",
                "msgText" => "This is a reply from A"
            },
            [4] {
                 "userId" => "789",
                "msgText" => "This is another text from C"
            }
        ],
        "newParty" => [
            [0] {
                "userInfo" => {
                    "userNick" => "A",
                    "userType" => "CLIENT"
                },
                  "userId" => "123"
            },
            [1] {
                "userInfo" => {
                    "userNick" => "B",
                    "userType" => "AGENT"
                },
                  "userId" => "456"
            },
            [2] {
                "userInfo" => {
                    "userNick" => "C",
                    "userType" => "AGENT"
                },
                  "userId" => "789"
            }
        ]
    }
}

What I want to achieve is that each message entry stored in Elasticsearch contains the sender's userNick instead of the userId.

How can I map the userId of each message to the userNick of the matching newParty entry?
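
To make the goal concrete, the lookup I have in mind could be sketched in plain Ruby roughly as follows. This is only a sketch, assuming the event structure shown above. The method name replace_ids_with_nicks is hypothetical, and inside Logstash the same logic would presumably live in a ruby filter that reads the field with event.get("doc") and writes the result back with event.set("doc", ...).

```ruby
# Hypothetical helper (not part of Logstash): replaces "userId" in each
# message with the matching "userNick" taken from the newParty entries.
def replace_ids_with_nicks(doc)
  # With force_array => false, a single element is parsed as a Hash rather
  # than an Array, so normalize both cases (and a missing field) to an Array.
  as_list = ->(value) { value.is_a?(Array) ? value : [value].compact }

  # Build a lookup table: userId => userNick
  nicks = {}
  as_list.call(doc["newParty"]).each do |party|
    info = party["userInfo"] || {}
    nicks[party["userId"]] = info["userNick"]
  end

  messages = as_list.call(doc["message"]).map do |msg|
    id = msg["userId"]
    rest = msg.reject { |key, _| key == "userId" }
    # Fall back to the raw id if no newParty entry matches.
    rest.merge("userNick" => nicks[id] || id)
  end

  doc.merge("message" => messages)
end

# Small sample taken from the transcript above.
sample = {
  "message" => [
    { "userId" => "123", "msgText" => "This is just a text from A" },
    { "userId" => "456", "msgText" => "This is a text from B" }
  ],
  "newParty" => [
    { "userInfo" => { "userNick" => "A", "userType" => "CLIENT" }, "userId" => "123" },
    { "userInfo" => { "userNick" => "B", "userType" => "AGENT" }, "userId" => "456" }
  ]
}

replace_ids_with_nicks(sample)["message"].each do |m|
  puts "#{m['userNick']}: #{m['msgText']}"
end
```

The lookup table is built once per transcript, so each message only needs a single hash access. The sketch returns a new structure and leaves the input untouched.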

Thanks in advance.

Regards,
Christian


#2

Hi all,

Any ideas how to achieve this?

Regards,
Christian

