The original data is Avro data, but only plain => { charset => "BINARY" } is used for debugging to reduce interference.

Avro data, original (hexadecimal):

00000000: 00 80 40 00 cc e1 85 98 0e 00 3a 32 30 32 34 2d ..@.......:2024-
00000010: 30 34 2d 31 38 54 30 36 3a 33 32 3a 30 34 2e 30 04-18T06:3…

When using logstash to consume binary data, some bytes are replaced with ef bf bd

Badger May 6, 2024, 12:45pm 2

That tells the codec on the input to translate the data from BINARY to UTF-8. The code always outputs UTF-8 because that is all that the filters in the pipeline expect.

The EF BF BD is the UTF-8 encoding of the replacement character U+FFFD. See also this thread.
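To make the effect concrete, here is a minimal Python sketch of the same kind of conversion. It is an analogy, not Logstash's actual code: it assumes that every byte with no ASCII meaning is swapped for U+FFFD before the text is written out as UTF-8. The function name binary_to_utf8 is made up for illustration, and the input is the first 11 bytes of the hex dump in the question.

```python
# Illustrative only: mimics a lossy "binary to UTF-8" conversion.
# binary_to_utf8 is a hypothetical helper, not part of Logstash.

REPLACEMENT = "\ufffd"  # U+FFFD, the Unicode replacement character


def binary_to_utf8(raw: bytes) -> bytes:
    """Replace every non-ASCII byte with U+FFFD, then encode as UTF-8."""
    text = raw.decode("ascii", errors="replace")
    return text.encode("utf-8")


# First 11 bytes of the Avro payload shown in the question.
raw = bytes.fromhex("00 80 40 00 cc e1 85 98 0e 00 3a")
converted = binary_to_utf8(raw)

print(REPLACEMENT.encode("utf-8").hex(" "))  # ef bf bd
print(converted.hex(" "))
```

Each of the five bytes at or above 0x80 (80, cc, e1, 85, 98) comes out as the three bytes ef bf bd, so the original values cannot be recovered downstream.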

Perhaps try the ASCII-8BIT encoding?
