I want to capture the requests and responses on an HTTP port, but the data returned by HTTP contains Chinese characters and uses GBK encoding. Garbled characters appear after Packetbeat sends the data to Elasticsearch. I did not find anything in the official manual about how to decode GBK. Is there any way to solve this problem?
Please share your Packetbeat version, configuration, and an example of the HTTP transactions that you are having issues with (at least the request/response headers).
I was looking through the Packetbeat code a bit. I see that it does look at the Content-Encoding and Transfer-Encoding headers, and uses them to decode the body in some circumstances (such as gzip and deflate). But I don't think it converts the body to UTF-8 for the charsets that can be specified in the Content-Type header (for example, Content-Type: text/xml; charset=GBK).
This seems like it would be feasible to add as an enhancement, given that Filebeat already does something very similar for files via the encoding setting of the log input (Log input | Filebeat Reference [8.11] | Elastic).
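For reference, this is roughly what that looks like on the Filebeat side (the path is just a placeholder); Filebeat transcodes the lines to UTF-8 before shipping them:

filebeat.inputs:
- type: log
  paths:
    - /var/log/app/*.log   # placeholder path
  encoding: gbk            # Filebeat reads the file as GBK and emits UTF-8 events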
I don't know if it's possible, but a potential workaround might be to convert the bytes from GBK to UTF-8 within Elasticsearch Ingest Node using a script processor.
Packetbeat version: packetbeat-7.17.10-linux-x86_64
HTTP configuration in packetbeat.yml:
- type: http
  # Configure the ports where to listen for HTTP traffic. You can disable
  # the HTTP protocol by commenting out the list of ports.
  ports: [8070, 8080]
  send_headers: true
  decode_body: true
  send_request: true
  send_response: true
  include_response_body_for: []
  include_request_body_for: []
  include_body_for: ['text/xml']
Looking forward to your reply, thanks
I posted my reply above (the 4th post in this thread); looking forward to your reply, thank you.
I don't have a solution right now other than routing the data through Logstash and doing the decoding there (a rough sketch follows), but I wrote up an enhancement issue for Packetbeat: [Packetbeat] Support converting text encodings to UTF-8 · Issue #37598 · elastic/beats · GitHub.
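If you want to try the Logstash route, a minimal sketch of the decoding with a ruby filter might look like the following. The field name [http][response][body][content] is my assumption about where Packetbeat puts the response body in your events, and this only helps if the GBK bytes actually arrive in Logstash intact rather than already mangled:

filter {
  ruby {
    # Assumed field name; adjust to wherever the GBK text ends up in your events.
    code => '
      body = event.get("[http][response][body][content]")
      unless body.nil?
        # Re-tag the raw bytes as GBK, then transcode to UTF-8.
        utf8 = body.dup.force_encoding("GBK").encode("UTF-8", invalid: :replace, undef: :replace)
        event.set("[http][response][body][content]", utf8)
      end
    '
  }
}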