UTF-8 in Windows

Hello,

There seems to be a problem with how Logstash (5.5.0) handles UTF-8 input on Windows (2008 R2/2012 R2). With the config below, any Arabic input comes out as question marks (??????????). This seems to be independent of the input plugin (I tried the 'file' and 'beats' inputs) and of the codec (json/plain).

input {
    file {
        path => "C:\ELK\temp\input.txt"
    }
}
output {
    file {
        path => "C:\ELK\temp\output.txt"
    }
}

Using {charset => ["CP1252"]} as proposed in this discussion does fix the issue, even though the input is UTF-8.
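
For reference, here is a minimal sketch of that workaround. It assumes the charset is set on the plain codec of the file input, which is where the codec's charset option normally lives. The exact snippet from the linked discussion is not reproduced here, so treat the placement as an assumption.

```
input {
    file {
        path => "C:\ELK\temp\input.txt"
        # Assumption: the workaround is applied on the input codec.
        # The file itself is UTF-8; CP1252 is only what makes the output readable on Windows.
        codec => plain {
            charset => "CP1252"
        }
    }
}
output {
    file {
        path => "C:\ELK\temp\output.txt"
    }
}
```

With the beats input, the same codec block would presumably go inside the beats { } section instead.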

Strangely, the above config works as expected on Linux without specifying the CP1252 charset.

Any thoughts on this are appreciated.

Thanks

EDIT: I did some testing, and it seems this issue was introduced in v5.0.0; it worked as expected in v2.4.0.
