Unable to decode gzip compressed http response from Packetbeat file output

We are currently trying to capture SOAP request and response body content using Packetbeat 7.1.1 (as shown in the screenshot PacketbeatOutputConfig). Packetbeat is configured to output to a file (as shown in the screenshot PacketbeatProtocolConfig).

The response body is gzipped, and we want to save the compressed response to a file and decode it ourselves, as we expect a high number of responses. We compared the output from Packetbeat (Screenshot1) with the output from TCPDump (Screenshot2) and circled the hex where the compression starts; the two don't match. This looks like the reason we can't decode the response.

Is there any way of getting the actual compressed response (as shown in the TCPDump screenshot) from the network packet, so that it can be decompressed from the Packetbeat output file?

We noticed similar issues on Stack Overflow that haven't been resolved:


PacketbeatProtocolConfig PacketbeatOutputConfig
Screenshot2

Update on issue: we found that by leaving decode_body: false and changing codec.format, we were able to get the compressed response.

codec.format:
  string: '%{[http][response][body][content]}'

This example returns the payload as a plain string rather than as JSON, so bytes above 0x7f are not replaced by the Unicode replacement character. Once the output has been written to a file (here called output), it can be decompressed with the following command:

zcat output
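
If the decompression needs to happen in a script rather than on the command line, something along these lines may work. It is only a sketch: it assumes the file holds a single gzip-compressed body, possibly followed by a trailing newline, and the function names are made up for illustration.

```python
import gzip
import zlib


def decompress_gzip_body(raw: bytes) -> bytes:
    """Decompress one gzip member and ignore any bytes that follow it."""
    # wbits=31 tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(wbits=31)
    data = decoder.decompress(raw)
    # Anything after the gzip trailer (for example a newline) ends up in
    # decoder.unused_data and is dropped here.
    return data + decoder.flush()


def read_output_file(path: str = "output") -> bytes:
    """Read the file written by Packetbeat and return the decompressed body."""
    # Assumption: "output" is the file name used in the zcat example.
    with open(path, "rb") as handle:
        return decompress_gzip_body(handle.read())
```

Using zlib directly instead of gzip.decompress is a deliberate choice in this sketch, because gzip.decompress raises an error when extra bytes follow the compressed data.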

There are a few problems here.

One is that in TCPDump (the first screenshot) you're seeing the raw HTTP stream, which includes the Chunked transfer-encoding.

Packetbeat has already removed that from the payload.
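
To illustrate why the hex differs, here is a rough Python sketch of what removing the chunked framing looks like. It is not Packetbeat's code; the dechunk function is a hypothetical helper that assumes a well-formed chunked body with no trailers.

```python
import gzip


def dechunk(raw: bytes) -> bytes:
    """Strip chunked transfer-encoding framing from a raw HTTP body."""
    out = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hex, terminated by CRLF.
        line_end = raw.index(b"\r\n", pos)
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break  # a zero-length chunk marks the end of the body
        start = line_end + 2
        out += raw[start:start + size]
        # Skip the chunk data and the CRLF that follows it.
        pos = start + size + 2
    return bytes(out)


def chunk(body: bytes, size: int) -> bytes:
    """Build a chunked body, only used here to produce sample input."""
    parts = []
    for i in range(0, len(body), size):
        piece = body[i:i + size]
        parts.append(b"%x\r\n" % len(piece) + piece + b"\r\n")
    parts.append(b"0\r\n\r\n")
    return b"".join(parts)
```

In a capture of the raw stream, the gzip magic bytes 1f 8b come after the first chunk-size line, whereas the de-chunked body starts with them directly.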

The second is that when the Packetbeat data is encoded as JSON, the raw body is transformed to UTF-8 and the bytes above 0x7f are replaced by the Unicode replacement character (\ufffd).
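
The effect can be reproduced in a few lines of Python. This only imitates the behaviour described above and is not how Packetbeat is implemented; json_round_trip and the "content" key are names made up for the example.

```python
import gzip
import json


def json_round_trip(body: bytes) -> bytes:
    """Imitate storing a binary body in a JSON string and reading it back."""
    # Invalid UTF-8 sequences are replaced with U+FFFD, which loses the
    # original byte values.
    text = body.decode("utf-8", errors="replace")
    encoded = json.dumps({"content": text})
    return json.loads(encoded)["content"].encode("utf-8")


def is_valid_gzip(body: bytes) -> bool:
    """Return True if the bytes can be decompressed as gzip."""
    try:
        gzip.decompress(body)
        return True
    except (OSError, EOFError):
        return False
```

Passing a gzip body through json_round_trip already damages the second header byte (0x8b), so the result no longer starts with the gzip magic number and cannot be decompressed.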

I don't think there's an easy workaround for this other than using Packetbeat to also decode the gzip content-encoding.

Feel free to open an enhancement request in the Beats repo so we can consider using the binary datatype for bodies.
