Parsing netflow (XML) logs, problems with encoding

Hi all,
I have some netflow logs in XML format that I need to parse with Logstash. All is well until it reaches the payload fields (sourcePayloadAsUTF and destinationPayloadAsUTF), which seem to use a different character set (?), and Logstash fails to parse the event.
This is the error message returned:

Error parsing xml with XmlSimple {:source=>"message", :value=>"<FlowForm><icmpType>127</icmpType><destinationPort>7777</destinationPort><flowType>0</flowType><destinationTCPFlagsDescription>P,A</destinationTCPFlagsDescription><sourceDSCP>0</sourceDSCP><customProps>{dns_flow_flag=0, File_Name=N/A, HTTP Host=N/A, File_Hash=N/A}</customProps><destinationPrecedence>0</destinationPrecedence><lastPacketDateTime>5 Jun 2020, 23:42:54</lastPacketDateTime><srcPortInvalid>false</srcPortInvalid><appName>Misc.ttc</appName><destinationTOS>Best Effort</destinationTOS><magnitude>7</magnitude><dstPortInvalid>false</dstPortInvalid><sourceV6Ip>0:0:0:0:0:0:0:0</sourceV6Ip><sourcePrecedence>0</sourcePrecedence><totalDestinationPackets>13</totalDestinationPackets><destinationPayloadAsUTF>\u0000\b\u00008\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u000F\u0000\u0010\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u00007\u0000\u0000\u0000\u0001\u0000\u0000\u0000\u0001\u0000\f\u0000\u000F\u0000\u0010\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u00007\u0000\u0000</destinationPayloadAsUTF><categoryDescription>Misc</categoryDescription><category>18438</category><sourcePayloadAsUTF>\u0000\u0002\u0000\u0000\u0000\eexecute procedure WS_INIT()\u0000\u0000\u0016\u00001\u0000\f\u0000\u0004\u0000\u0000\a\u0000\f\u0000\u0004\u0000\u0000\v\u0000\f\u0000\u0002\u0000\u0002\u0000+ex</sourcePayloadAsUTF><firstPacketTime>5 Jun 2020, 23:40:52</firstPacketTime><sourceTOS>Best Effort</sourceTOS><direction>L2L</direction><eventName>Misc.ttc</eventName><sourcePort>59792</sourcePort><sourceTCPFlagsDescription>P,A</sourceTCPFlagsDescription><compoundAppID>42060</compoundAppID><sourcePayloadAsHexOneLine>00 02 00 00 00 1b 65 78 65 63 75 74 65 20 70 72 6f 63 65 64 75 72 65 20 57 53 5f 49 4e 49 54 28 29 00 00 16 00 31 00 0c 00 04 00 0d 00 07 00 0c 00 04 00 0d 00 0b 00 0c 00 02 00 02 00 2b 65 
78</sourcePayloadAsHexOneLine><protocol>6</protocol><flowTypeDescription>Standard Flow</flowTypeDescription><tLVProperties><LinkedHashMap><empty>false</empty></LinkedHashMap></tLVProperties><totalDestinationBytes>2907</totalDestinationBytes><destinationTCPFlags>24</destinationTCPFlags><directionDescription>L2L</directionDescription><startTime>1591389652515</startTime><flowInterfaceName>napatech0</flowInterfaceName><protocolName>tcp_ip</protocolName><mPCEvent>false</mPCEvent><relevance>10</relevance><flowSensorName>SIEMfs2</flowSensorName><appId>42060</appId><totalSourceBytes>1306</totalSourceBytes><srcIp>10.180.17.4</srcIp><destinationPayloadAsHexOneLine>00 08 00 38 00 0d 00 00 00 00 00 00 00 00 00 00 00 0f 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 37 00 00 00 01 00 00 00 01 00 0c 00 0f 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 37 00 00</destinationPayloadAsHexOneLine><icmpTypeDescription>Type 127, Code 127</icmpTypeDescription><destinationPayloadAsBase64>AAgAOAANAAAAAAAAAAAAAAAPABAAAAAAAAAAAAAAAAAANwAAAAEAAAABAAwADwAQAAAAAAAAAAAAAAAAADcAAA==</destinationPayloadAsBase64><destinationDSCP>0</destinationDSCP><totalSourcePackets>15</totalSourcePackets><stopTime>1591389832515</stopTime><credibility>10</credibility><sourceTCPFlags>24</sourceTCPFlags><domainName>Default Domain</domainName><domainID>0</domainID><flowIdentifier>0</flowIdentifier><severity>2</severity><qid>53264753</qid><startDateTime>5 Jun 2020, 23:40:52</startDateTime><icmpCode>127</icmpCode><destIp>10.180.1.2</destIp><totalBytes>4213</totalBytes><sourcePayloadAsBase64>AAIAAAAbZXhlY3V0ZSBwcm9jZWR1cmUgV1NfSU5JVCgpAAAWADEADAAEAA0ABwAMAAQADQALAAwAAgACACtleA==</sourcePayloadAsBase64><stopDateTime>5 Jun 2020, 23:43:52</stopDateTime><destinationV6Ip>0:0:0:0:0:0:0:0</destinationV6Ip><sensorInterfaceId>1</sensorInterfaceId></FlowForm>", :exception=>#<REXML::ParseException: #<RuntimeError: Illegal character "\u0000" in raw string 87                                                                               
                                                                                                                                                                 7"> 

My logstash config:

filter {
  if "flows" in [tags] {
    xml {
      source => "message"
      target => "xml"
    }
  }
}

For the Logstash input I am using the default codec (plain) with the "UTF-8" charset.
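
For reference, the failure does not look specific to Logstash. The \u0000, \b, \e and similar escapes in the error above appear to be real control characters in the payload fields, and XML 1.0 does not allow those (apart from tab, LF and CR) inside a document. Below is a minimal Python sketch of the idea, not a Logstash solution: it strips the forbidden control characters before parsing. The sample record is a shortened, hypothetical stand-in for the real flow, and the helper names are made up for illustration. Note that dropping the characters changes the payload text; the Base64 and hex fields in the record still carry the raw bytes.

```python
import re
import xml.etree.ElementTree as ET

# C0 control characters that XML 1.0 forbids: everything below 0x20
# except tab (0x09), line feed (0x0A) and carriage return (0x0D).
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_illegal_xml_chars(text: str) -> str:
    """Return text with the forbidden control characters removed."""
    return ILLEGAL_XML_CHARS.sub("", text)


def parses(text: str) -> bool:
    """Report whether text is accepted by the XML parser."""
    try:
        ET.fromstring(text)
        return True
    except (ET.ParseError, ValueError):
        # ValueError is caught defensively in case a parser build
        # rejects NUL characters before it starts tokenizing.
        return False


# Shortened, hypothetical stand-in for the flow record in the error message.
raw = (
    "<FlowForm>"
    "<sourcePayloadAsUTF>\x00\x02\x00\x00\x00\x1b"
    "execute procedure WS_INIT()\x00</sourcePayloadAsUTF>"
    "<protocol>6</protocol>"
    "</FlowForm>"
)

cleaned = strip_illegal_xml_chars(raw)
root = ET.fromstring(cleaned)

print(parses(raw))      # the unmodified record is rejected
print(parses(cleaned))  # the cleaned record is accepted
print(root.findtext("sourcePayloadAsUTF"))
```

In a Logstash pipeline the equivalent cleanup would presumably have to run on the "message" field before the xml filter sees it.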

Any help would be appreciated.
