Split XML into multiple events


(Benjamin Carriou) #1

Hello!

I know this subject has come up several times, but I cannot solve my problem.

Logstash version: 2.4
Split plugin version: 3.1.2

Here is an example of the XML that I want to split:

<?xml version='1.0' encoding='utf-8'?>
<S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/">
<S:Body>
	<fl:FlightListByAerodromeReply xmlns:fw="eurocontrol/cfmu/b2b/FlowServices" xmlns:as="eurocontrol/cfmu/b2b/AirspaceServices" xmlns:fl="eurocontrol/cfmu/b2b/FlightServices" xmlns:cm="eurocontrol/cfmu/b2b/CommonServices" xmlns:ns0="http://www.fixm.aero/base/4.0" xmlns:ns2="http://www.fixm.aero/flight/4.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
		<requestReceptionTime>2017-07-27 14:58:12</requestReceptionTime>
		<requestId>B2B_CUR:36666587</requestId>
		<sendTime>2017-07-27 14:58:12</sendTime>
		<status>OK</status>
		<data>
			<flights>
				<flight>
					<flightId>
						<id>AT00859043</id>
						<keys>
							<aircraftId>EZY49QU</aircraftId>
							<aerodromeOfDeparture>LEMD</aerodromeOfDeparture>
							<nonICAOAerodromeOfDeparture>false</nonICAOAerodromeOfDeparture>
							<airFiled>false</airFiled>
							<aerodromeOfDestination>LFPG</aerodromeOfDestination>
							<nonICAOAerodromeOfDestination>false</nonICAOAerodromeOfDestination>
							<estimatedOffBlockTime>2017-07-27 08:04</estimatedOffBlockTime>
						</keys>
					</flightId>
				</flight>
			</flights>
			<flights>
				<flight>
					<flightId>
						<id>AT00853607</id>
						<keys>
							<aircraftId>TAR8713</aircraftId>
							<aerodromeOfDeparture>LFPG</aerodromeOfDeparture>
							<nonICAOAerodromeOfDeparture>false</nonICAOAerodromeOfDeparture>
							<airFiled>false</airFiled>
							<aerodromeOfDestination>DTTJ</aerodromeOfDestination>
							<nonICAOAerodromeOfDestination>false</nonICAOAerodromeOfDestination>
							<estimatedOffBlockTime>2017-07-27 09:55</estimatedOffBlockTime>
						</keys>
					</flightId>
				</flight>
			</flights>
			<effectiveTrafficWindow>
				<wef>2017-07-27 10:00</wef>
				<unt>2017-07-27 11:00</unt>
			</effectiveTrafficWindow>
		</data>
	</fl:FlightListByAerodromeReply>
</S:Body>
</S:Envelope>

Here is the Logstash configuration I use:

input {
  tcp {
    port => 7001
    type => "FlightListByAerodromeReply"
  }
}

filter {
  if [type] == "FlightListByAerodromeReply" {
    xml {
      source => "message"
      #store_xml => false
      target => "parsed"
      #force_array => false
    }
    split {
      field => "parsed[Body]"
    }
    split {
      field => "parsed[Body][FlightListByAerodromeReply]"
    }
    split {
      field => "parsed[Body][FlightListByAerodromeReply][data]"
    }
    split {
      field => "parsed[Body][FlightListByAerodromeReply][data][effectiveTrafficWindow]"
    }
    split {
      field => "parsed[Body][FlightListByAerodromeReply][data][flights]"
    }
    split {
      field => "parsed[Body][FlightListByAerodromeReply][data][flights][flight]"
    }
    split {
      field => "parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId]"
    }
    split {
      field => "parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId][keys]"
    }
    mutate {
      add_field => { flightId => "%{parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId][id]}" }
      add_field => { aircraftId => "%{parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId][keys][aircraftId]}" }
      add_field => { aerodromeOfDestination => "%{parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId][keys][aerodromeOfDestination]}" }
      add_field => { estimatedOffBlockTime => "%{parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId][keys][estimatedOffBlockTime]}" }
      add_field => { airFiled => "%{parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId][keys][airFiled]}" }
      add_field => { nonICAOAerodromeOfDestination => "%{parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId][keys][nonICAOAerodromeOfDestination]}" }
      add_field => { nonICAOAerodromeOfDeparture => "%{parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId][keys][nonICAOAerodromeOfDeparture]}" }
      add_field => { aerodromeOfDeparture => "%{parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId][keys][aerodromeOfDeparture]}" }
      add_field => { wef => "%{parsed[Body][FlightListByAerodromeReply][data][effectiveTrafficWindow][wef]}" }
      add_field => { unt => "%{parsed[Body][FlightListByAerodromeReply][data][effectiveTrafficWindow][unt]}" }
      add_field => { requestReceptionTime => "%{parsed[Body][FlightListByAerodromeReply][requestReceptionTime]}" }
      add_field => { sendTime => "%{parsed[Body][FlightListByAerodromeReply][sendTime]}" }
      add_field => { status => "%{parsed[Body][FlightListByAerodromeReply][status]}" }
      remove_field => [ "message" ]
      remove_field => [ "parsed" ]
    }
  }
}

Continued in the next post ...


(Benjamin Carriou) #2

Here is the result I get:

Only String and Array types are splittable. field:parsed[Body][FlightListByAerodromeReply][data][flights][flight] is of type = Hash {:level=>:warn}
Only String and Array types are splittable. field:parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId] is of type = Hash {:level=>:warn}
Only String and Array types are splittable. field:parsed[Body][FlightListByAerodromeReply][data][flights][flight][flightId][keys] is of type = Hash {:level=>:warn}
{
                     "@version" => "1",
                   "@timestamp" => "2017-07-28T08:40:56.208Z",
                         "host" => "192.168.10.160",
                         "port" => 5269,
                         "type" => "FlightListByAerodromeReply",
                     "flightId" => "AT00871821",
                   "aircraftId" => "AFR181J",
       "aerodromeOfDestination" => "EIDW",
        "estimatedOffBlockTime" => "2017-07-28 10:35",
                     "airFiled" => "false",
"nonICAOAerodromeOfDestination" => "false",
  "nonICAOAerodromeOfDeparture" => "false",
         "aerodromeOfDeparture" => "LFPG",
                          "wef" => "2017-07-28 10:00",
                          "unt" => "2017-07-28 11:00",
         "requestReceptionTime" => "2017-07-28 08:40:55",
                     "sendTime" => "2017-07-28 08:40:55",
                       "status" => "OK"
}
{
                     "@version" => "1",
                   "@timestamp" => "2017-07-28T08:40:56.208Z",
                         "host" => "192.168.10.160",
                         "port" => 5269,
                         "type" => "FlightListByAerodromeReply",
                         "tags" => [
    [0] "_split_type_failure"
],
                     "flightId" => "AT00871821",
                   "aircraftId" => "AFR181J",
       "aerodromeOfDestination" => "EIDW",
        "estimatedOffBlockTime" => "2017-07-28 10:35",
                     "airFiled" => "false",
"nonICAOAerodromeOfDestination" => "false",
  "nonICAOAerodromeOfDeparture" => "false",
         "aerodromeOfDeparture" => "LFPG",
                          "wef" => "2017-07-28 10:00",
                          "unt" => "2017-07-28 11:00",
         "requestReceptionTime" => "2017-07-28 08:40:55",
                     "sendTime" => "2017-07-28 08:40:55",
                       "status" => "OK"
}
[...]

We can see that there are splitting problems: if I have 50 flights, the information of the last flight replaces that of the previous 49 flights.
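
For reference, the warnings and the `_split_type_failure` tag in the output are consistent with a split that only fans out Strings and Arrays. The plain-Ruby sketch below models that behaviour on a single field. It is not the plugin's code: `split_field` and the two-flight event are hypothetical names made up for illustration, events are modelled as ordinary Hashes, and the newline terminator for Strings is assumed to be the default.

```ruby
# Plain-Ruby model of a split on one field (illustrative, not the plugin).
# `event` is an ordinary Hash standing in for a Logstash event.
def split_field(event, field)
  value = event[field]
  case value
  when Array
    # One copy of the event per element; the field now holds that element.
    value.map { |element| event.merge(field => element) }
  when String
    # Strings are cut on a newline terminator (assumed default).
    value.split("\n").map { |part| event.merge(field => part) }
  else
    # Anything else (for example a Hash) is passed through with a tag.
    tags = (event['tags'] || []) + ['_split_type_failure']
    [event.merge('tags' => tags)]
  end
end

event = {
  'status'  => 'OK',
  'flights' => [{ 'id' => 'AT00859043' }, { 'id' => 'AT00853607' }]
}

# The Array of two flights becomes two events, each keeping 'status'.
per_flight = split_field(event, 'flights')
per_flight.each { |e| puts e.inspect }

# Splitting the same field again now meets a Hash and only adds the tag.
failed = split_field(per_flight.first, 'flights')
puts failed.first['tags'].inspect
```

In this model, a field only needs one split per level that is still an Array; a second split on a level that has already become a Hash yields nothing but the failure tag.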

I can use Ruby code to retrieve the total number of flights:

ruby {
    code => "
      event['count'] = event['parsed']['Body'][0]['FlightListByAerodromeReply'][0]['data'][0]['flights'].length
    "
}
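
The same indexing can be used to walk the flights one by one rather than just counting them. The sketch below runs outside Logstash on a hand-written stand-in for `event['parsed']`. Its shape is an assumption inferred from the `[0]` indexing in the count above (every element, including leaf text, wrapped in an Array), and `flat_flights` is a hypothetical name. Only a few of the fields from the example XML are copied.

```ruby
# Hand-written stand-in for event['parsed'] (assumed shape, see above).
parsed = {
  'Body' => [{
    'FlightListByAerodromeReply' => [{
      'status' => ['OK'],
      'data' => [{
        'flights' => [
          { 'flight' => [{ 'flightId' => [{
            'id'   => ['AT00859043'],
            'keys' => [{ 'aircraftId' => ['EZY49QU'],
                         'aerodromeOfDeparture' => ['LEMD'] }]
          }] }] },
          { 'flight' => [{ 'flightId' => [{
            'id'   => ['AT00853607'],
            'keys' => [{ 'aircraftId' => ['TAR8713'],
                         'aerodromeOfDeparture' => ['LFPG'] }]
          }] }] }
        ],
        'effectiveTrafficWindow' => [{ 'wef' => ['2017-07-27 10:00'],
                                       'unt' => ['2017-07-27 11:00'] }]
      }]
    }]
  }]
}

reply  = parsed['Body'][0]['FlightListByAerodromeReply'][0]
data   = reply['data'][0]
window = data['effectiveTrafficWindow'][0]

# Build one flat Hash per <flights> entry, repeating the shared fields.
flat_flights = data['flights'].map do |entry|
  flight_id = entry['flight'][0]['flightId'][0]
  keys      = flight_id['keys'][0]
  {
    'flightId'             => flight_id['id'][0],
    'aircraftId'           => keys['aircraftId'][0],
    'aerodromeOfDeparture' => keys['aerodromeOfDeparture'][0],
    'wef'                  => window['wef'][0],
    'unt'                  => window['unt'][0],
    'status'               => reply['status'][0]
  }
end

puts flat_flights.length
flat_flights.each { |flight| puts flight.inspect }
```

Each resulting Hash carries its own flight's values together with the shared `wef`, `unt` and `status`, which is the per-flight layout the mutate block in the first post appears to aim for.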

But how do I iterate over the flights?
I tried this, without success:

[...] 
split {
  field => "parsed[Body][FlightListByAerodromeReply][data][flights][%{count}]"
}
[...]

Thanks for your help!


(Magnus Bäck) #3

Could you give an example of the desired result for the example XML document above? It's not clear to me what the end goal is, and the sequence of split filters is confusing.

