Can't filter Unicode logs at all

Hello,

I need help figuring out how to configure Filebeat, Logstash, or both so that I can successfully
filter logs by message when the source files are saved with Unicode encoding.
I will provide screenshots to describe the problem as best as I can.

Currently I'm using version 6.4.0 of Filebeat, Logstash, Elasticsearch and Kibana.

Logstash, Elasticsearch and Kibana are running as Docker services on one node.

  • Docker version 17.12.1-ce
  • OS: Ubuntu 16.04 LTS (Linux)

Filebeat is harvesting logs on Windows 10 as a process and sending them to Logstash.

Here are my configs for each component.

Filebeat

filebeat.yml

filebeat.inputs:

    - type: log
      paths:
          - C:\Users\C5260750\Desktop\customlogs\*

      fields:
        level: debug
        status of machine: running
        review: 1

      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after

output.logstash:
    hosts: ["10.55.177.60:5044"]

Logstash

logstash.conf

input {
  beats {
    port => 5044
  }
}
output {
  elasticsearch {
    hosts => "http://10.55.177.60:9200"
    user => "elastic"
    password => "changeme"
  }
  stdout {
    codec => rubydebug
  }
}

logstash.yml

http.host: "0.0.0.0"
path.config: /usr/share/logstash/pipeline
xpack.monitoring.elasticsearch.url: http://10.55.177.60:9200
xpack.monitoring.elasticsearch.username: elastic
xpack.monitoring.elasticsearch.password: changeme

Elasticsearch

Default

Kibana

kibana.yml

server.name: kibana
server.host: "0"
elasticsearch.url: http://10.55.177.60:9200
elasticsearch.username: elastic
elasticsearch.password: changeme
xpack.monitoring.ui.container.elasticsearch.enabled: true

Sample log line

2018-08-20 12:39:32.232321 Sql NoteLgAlw I Tec Transaction Started, Nested level: 1, MVCC Start Timestamp: 23249765 # # TID=5868 __DBMC_TransactionManager.h 318 Customized=0

File encoding: Unicode

Same Log in Logstash

Same Log in Kibana

Here is how the log is presented in Kibana.

BUT, when I try to filter by message, this is what happens.

(screenshot: 2018-08-30_13-55-42)

NO MATCH AT ALL.

I can't filter by message at all when the file the log is harvested from is saved with Unicode encoding.

Here is the behavior I expect:

Same log, same configs; the only thing that's different is the encoding of the file the log is harvested from.

But I can't use this as a solution. The encoding of the file must stay Unicode.

I tried to resolve this by setting

    encoding: plain
    ...
    encoding: utf-8

in filebeat.yml (see the placement sketch below).
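
For clarity, a rough sketch of where I put it, assuming the encoding setting belongs on the log input itself (other options omitted):

filebeat.inputs:

    - type: log
      paths:
          - C:\Users\C5260750\Desktop\customlogs\*
      encoding: utf-8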

I also tried several variants of the Logstash codec (placement sketched below):

codec => plain { charset => "UTF-8" }
codec => plain { charset => "UTF-16" }
codec => plain { charset => "ASCII" }
...
codec => plain { charset => "ISO-8859-*" }

But no matter what I try or what I do, the result is still the same...

Can anyone please help me with this one?

Thank you!

Well, unfortunately "Unicode" isn't actually an encoding, so it's not clear what kind of file you're actually getting from Windows. Judging by https://stackoverflow.com/questions/13894898/unicode-file-in-notepad, I suggest you try utf-16le in your Filebeat configuration.
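
In your log input that would look roughly like this (only a sketch; keep your other options as they are):

    - type: log
      paths:
          - C:\Users\C5260750\Desktop\customlogs\*
      encoding: utf-16le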


Thanks for your suggestion.
When I set the encoding to utf-16 (not utf-16le) in my filebeat.yml and removed codec => plain { charset => "UTF-16" } from logstash.conf, it resolved my issue. Funny, because I tried that before and it didn't work at all, but that was with Filebeat 6.2.0... never mind.

So thanks again.
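
For anyone who hits the same thing, the relevant part of my filebeat.yml now looks roughly like this (the multiline and fields options are unchanged from my first post):

filebeat.inputs:

    - type: log
      paths:
          - C:\Users\C5260750\Desktop\customlogs\*
      encoding: utf-16

and the beats input in logstash.conf is back to just the port, with no codec/charset setting.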
