I have a message that contains Unicode escape sequences.
I want to convert it to UTF-8 characters in my native language (Vietnamese).
The input comes from a Filebeat filestream input.
I use Logstash to parse the message:
\u0043\u1ea3\u006d\u0020\u01a1\u006e\u0020\u0071\u0075\u00fd\u0020\u006b\u0068\u00e1\u0063\u0068
The expected result is:
"Cảm ơn quý khách"
I wrote a simple Ruby script to test this, and it works:
require 'uri'
message = "\u0043\u1ea3\u006d\u0020\u01a1\u006e\u0020\u0071\u0075\u00fd\u0020\u006b\u0068\u00e1\u0063\u0068"
enc_uri = URI.decode_www_form_component(message)
p enc_uri
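One thing worth checking about the script above: in a double-quoted Ruby literal, the parser itself converts each `\uXXXX` into a character before `URI.decode_www_form_component` is ever called, so the script may appear to work without the URI call doing anything. The following is a minimal sketch of that check, assuming the log line read by Filebeat contains literal backslashes (which a single-quoted literal imitates); the variable names `parsed` and `raw` are only illustrative.

```ruby
require 'uri'

# Double-quoted literal: the Ruby parser turns \uXXXX into real characters.
parsed = "\u0043\u1ea3\u006d"

# Single-quoted literal: the backslashes stay in the string, which is closer
# to what a line read from a log file would contain (assumption).
raw = '\u0043\u1ea3\u006d'

puts parsed.length  # 3
puts raw.length     # 18

# decode_www_form_component only decodes %XX sequences and '+',
# so both strings pass through unchanged.
puts URI.decode_www_form_component(parsed) == parsed  # true
puts URI.decode_www_form_component(raw) == raw        # true
```

If this holds, the difference between the standalone script and the Logstash filter would come from how the string is created, not from the URI call.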
But when I put the same code into a ruby filter in Logstash and print the result with puts for testing, it does not work:
filter {
  ruby {
    init => "require 'uri'"
    code => "
      @enc_uri = enc_uri = URI.decode_www_form_component(event.get('message'))
      puts @enc_uri
    "
  }
}
Unexpected result:
## For this line, I expected: `"Cảm ơn quý khách"`
\u0043\u1ea3\u006d\u0020\u01a1\u006e\u0020\u0071\u0075\u00fd\u0020\u006b\u0068\u00e1\u0063\u0068
{
    "message" => "\\u0043\\u1ea3\\u006d\\u0020\\u01a1\\u006e\\u0020\\u0071\\u0075\\u00fd\\u0020\\u006b\\u0068\\u00e1\\u0063\\u0068",
    "event" => {
        "original" => "\\u0043\\u1ea3\\u006d\\u0020\\u01a1\\u006e\\u0020\\u0071\\u0075\\u00fd\\u0020\\u006b\\u0068\\u00e1\\u0063\\u0068"
    },
    "ecs" => {
        "version" => "8.0.0"
    },
    "input" => {
        "type" => "filestream"
    },
    "agent" => {
        "type" => "filebeat",
        "ephemeral_id" => "69ccd3be-66c2-45ab-8ac8-e585698c7a0a",
        "name" => "2285d6af9a56",
        "version" => "8.5.2",
        "id" => "a009634c-6ee6-487b-8d2b-87cf5c0cd7ec"
    },
    "host" => {
        "name" => "2285d6af9a56"
    },
    "@version" => "1",
    "log" => {
        "file" => {
            "path" => "/var/log/test/api.log"
        },
        "type" => "api",
        "offset" => 63342
    },
    "@timestamp" => 2022-12-03T08:33:53.754Z,
    "biz" => true,
    "tags" => [
        [0] "beats_input_codec_plain_applied"
    ]
}
Please help me understand why this happens and how to make it work.
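For reference, here is a minimal sketch of one possible approach in plain Ruby. It assumes the `message` field holds literal `\uXXXX` text, as the `\\u0043` shown in the rubydebug output suggests. The helper name `decode_unicode_escapes` is hypothetical, and the sketch only handles four-digit escapes in the Basic Multilingual Plane (no surrogate pairs).

```ruby
# Hypothetical helper: replaces each literal \uXXXX (backslash, 'u', four hex
# digits) with the UTF-8 character for that code point.
def decode_unicode_escapes(text)
  text.gsub(/\\u(\h{4})/) { [Regexp.last_match(1).hex].pack('U') }
end

# Single quotes keep the backslashes literal, like the raw log line.
raw = '\u0043\u1ea3\u006d\u0020\u01a1\u006e\u0020\u0071\u0075\u00fd\u0020\u006b\u0068\u00e1\u0063\u0068'

puts decode_unicode_escapes(raw)  # Cảm ơn quý khách
```

Inside the Logstash ruby filter, the same `gsub` could presumably be applied to `event.get('message')` and the result written back with `event.set('message', ...)`. Note that the backslashes in the regular expression may need extra care when the code is placed inside the double-quoted `code => "..."` string of the pipeline config.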