Broken encoding of Windows EventLog messages

On localized Windows, EventLog messages contain non-English letters, and Logstash fails to reproduce them correctly.

{
          "@timestamp" => 2017-02-14T03:42:20.105Z,
             "message" => "\xD1\xEB\xF3\xE6\xE1\xE0 VSS \xE2\xFB\xEA\xEB\xFE\xF7\xE0\xE5\xF2\xF1\xFF \xE8\xE7-\xE7\xE0 \xF2\xE0\xE9\xEC-\xE0\xF3\xF2\xE0 \xEF\xF0\xEE\xF1\xF2\xEE\xFF.\r\n\xEF\xF0\xE8\xE2\xE5\xF2Z1",
                "type" => "Win32-EventLog",

    "InsertionStrings" => [
        [0] "\xEF\xF0\xE8\xE2\xE5\xF22Z1"
    ],
            /// other fields removed ///
}

The message I expected:

{
          "@timestamp" => 2017-02-14T03:42:20.105Z,
             "message" => "Служба VSS выключается из-за тайм-аута простоя. приветZ",
                "type" => "Win32-EventLog",

    "InsertionStrings" => [
        [0] "привет2Z1"
    ],
            /// other fields removed ///
}

So, as we can see, all non-English letters in string fields come out as escaped byte codes instead of readable text.

logstash config:

input {
  eventlog {
    type    => 'Win32-EventLog'
    logfile => 'Application'
  }
}
output {
  stdout { codec => rubydebug }
}

Microsoft Windows 10 [Version 10.0.14393] Russian

C:\logstash\logstash.bat --version

logstash 5.2.0
jruby 1.7.25 (1.9.3p551) 2016-04-13 867cb81 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_111-b14 +jit [Windows 10-amd64]
java 1.8.0_111 (Oracle Corporation)
jvm Java HotSpot(TM) 64-Bit Server VM / 25.111-b14

C:\logstash\bin>logstash-plugin.bat install logstash-input-eventlog

Validating logstash-input-eventlog
Installing logstash-input-eventlog
Installation successful

C:\logstash\bin>logstash-plugin.bat update logstash-input-eventlog

Updating logstash-input-eventlog
No plugin updated

The same problem was reproduced on Windows Server 2008 R2 (Russian).

Some additional details:

C:\logstash\bin>logstash-plugin.bat list --verbose logstash-input-eventlog
logstash-input-eventlog (4.0.2)

The codes in the string "\xEF\xF0\xE8..." are the character codes of the Windows ANSI code page (in my case it is CP1251, but it depends on the locale).
I suppose we have to do the following:

  1. Convert the escaped string ('\xEF...') to raw bytes.
  2. Convert those bytes from the current Windows ANSI code page to the Logstash encoding (UTF-8?).
  3. Do this for all text fields.
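
To illustrate the three steps above, here is a minimal Ruby sketch. It is not code from logstash-input-eventlog: the helper names ansi_to_utf8 and convert_fields are made up for this example, and the CP1251 default is only an assumption that fits my Russian system.

```ruby
# Hypothetical helpers, not part of logstash-input-eventlog.

# Steps 1 and 2: treat the raw bytes as text in the given ANSI code page
# and transcode them to UTF-8. force_encoding only relabels the bytes;
# encode performs the actual conversion.
def ansi_to_utf8(raw, codepage = "Windows-1251")
  raw.dup.force_encoding(codepage).encode("UTF-8")
end

# Step 3: walk the event and convert every string, including strings
# nested inside arrays and hashes. Other value types are left as they are.
def convert_fields(value, codepage = "Windows-1251")
  case value
  when String
    ansi_to_utf8(value, codepage)
  when Array
    value.map { |v| convert_fields(v, codepage) }
  when Hash
    value.each_with_object({}) { |(k, v), h| h[k] = convert_fields(v, codepage) }
  else
    value
  end
end

# Sample data taken from the output shown earlier in this post.
event = {
  "message"          => "\xEF\xF0\xE8\xE2\xE5\xF2Z1",
  "InsertionStrings" => ["\xEF\xF0\xE8\xE2\xE5\xF22Z1"]
}

fixed = convert_fields(event)
puts fixed["message"]
puts fixed["InsertionStrings"].first
```

On a system with a different locale the code page argument would have to change accordingly.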

It is possible that on some systems this conversion will go wrong, so it would be best to add options to:

  • set the code page explicitly
  • disable the conversion entirely
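
A rough sketch of how these two options could behave, again in Ruby. The option names :codepage and :convert are purely hypothetical, as is the decode_field helper. They are not existing settings of the eventlog input, and the CP1251 default is an assumption for a Russian system.

```ruby
# Hypothetical sketch; :codepage and :convert are illustrative option
# names, not real settings of logstash-input-eventlog.
def decode_field(raw, options = {})
  convert  = options.fetch(:convert, true)
  codepage = options.fetch(:codepage, "Windows-1251") # assumed default

  # Conversion switched off: hand the original bytes back untouched.
  return raw unless convert

  # Relabel the bytes with the chosen code page, then transcode to UTF-8.
  # Bytes that have no mapping are replaced instead of raising an error.
  raw.dup.force_encoding(codepage).encode("UTF-8", :invalid => :replace, :undef => :replace)
end

raw = "\xEF\xF0\xE8\xE2\xE5\xF2"

puts decode_field(raw)                               # assumed CP1251 default
puts decode_field(raw, :codepage => "Windows-1252")  # explicit code page
puts decode_field(raw, :convert => false).bytes.inspect
```

With an explicit code page a wrong guess still yields text, just the wrong text, which is why the switch to turn conversion off seems useful as a fallback.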
