Message Parsing

Here is what my Logstash output looks like:

output {
  elasticsearch {
    index => "%{[@metadata][beat]}"
    hosts => "192.168.0.103"
  }
}

Logstash errors are as follows:

[2020-11-18T13:08:12,824][WARN ][logstash.codecs.jsonlines][main][26da92079e525d4bfdac5a892ff28079c6695bd768a516e8a992f0d588033c05] Received an event that has a different character encoding than you configured. {:text=>"\\u000E\\x97P]...
 [2020-11-18T13:08:12,826][WARN ][logstash.codecs.jsonlines][main][26da92079e525d4bfdac5a892ff28079c6695bd768a516e8a992f0d588033c05] JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unrecognized token 'z': was expecting ('true', 'false' or 'null')
 at [Source: (String)"z -9\x92\u0001~\u0000/\f\x960l...

I added this codec to the elasticsearch output:

codec => plain {
  charset => "ISO-8859-1"
}

But I am still getting similar error messages:

JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unexpected character...
Received an event that has a different character encoding than you configured. {:text=>"\\xB6\\xA6}e#\\x

Hi,

I don't think this has anything to do with the output - I suspect it is connected to the input:

JSON parse error, original data now in message field

Output plugins do not parse the data - they serialize it. Can you show us the complete pipeline?

input {
  tcp {
    port => 5044
    codec => json
  }
}

filter {
  date {
    match => [ "timeMillis", "UNIX_MS" ]
  }
}

output {
  elasticsearch {
    index => "%{[@metadata][beat]}"
    hosts => "192.168.0.103"
    codec => plain {
      charset => "ISO-8859-1"
    }
  }
}

Where do you get your data from? According to the Wikipedia pages for the characters in your error message:

\xB6

The pilcrow, ¶, also called the paragraph mark, paragraph sign, paraph, alinea, or blind P, is a typographical character marking the start of a paragraph.
(https://en.wikipedia.org/wiki/Pilcrow)

\xA6

The vertical bar, |, is a glyph with various uses in mathematics, computing, and typography. It has many names, often related to particular meanings: Sheffer stroke (in logic), verti-bar, vbar, stick, vertical line, vertical slash, bar, pike, or pipe, and several variants on these names. It is occasionally considered an allograph of broken bar (see below).

In what character encoding do you receive the data - have you tried setting the encoding on the input instead of the output?
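
As a rough sketch of what I mean (assuming the sender really emits ISO-8859-1 text, which nothing in this thread confirms yet), the charset would go on the input codec rather than on the elasticsearch output:

input {
  tcp {
    port => 5044
    # Assumption: the incoming bytes are ISO-8859-1 text;
    # replace with whatever encoding the sender actually uses.
    codec => json {
      charset => "ISO-8859-1"
    }
  }
}

If the incoming bytes are not text at all, no charset setting will make the JSON parse succeed.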

Best regards
Wolfram

I am sending logs from one of my Linux clients. Here is part of the filebeat.yml from that client:

- type: log
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log

Are you sure that the logs are in JSON format? If they are not, that would explain the json codec errors...
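
One more thing worth checking, sketched under an assumption: the output section of your filebeat.yml is not shown, but if Filebeat ships through output.logstash to port 5044, the data arrives in the Beats protocol rather than as JSON lines. A tcp input with a json codec would then see unreadable bytes, much like the ones in your warnings. In that case the input would look more like this:

input {
  beats {
    # Assumption: Filebeat's output.logstash points at this host and port.
    port => 5044
  }
}

With the beats input, [@metadata][beat] should also be populated, which the index setting in your output relies on.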
