JSON codec polluting logs with ERROR messages if input is plain string

Hello.
I have a question about
https://www.elastic.co/guide/en/logstash/current/plugins-codecs-json.html

Generally we use the codec and it works fine. Our application has logs like

{
    "profile": "dev",
    "logging-context": "default",
    "hostname": "xxx",
    "app": "xxx",
    "@timestamp": "2019-06-05T09:36:30.469+00:00",
    "level": "INFO",
    "logger_name": "xxx",
    "thread_name": "main",
    "message": "Logstash is awesome :)"
}

Sometimes in our DEV environment we enable some JVM profiling logs, which produce plain strings: for example, GC logs or additional info about JVM attributes that we use to track performance issues.

As the documentation describes:

If this codec receives a payload from an input that is not valid JSON, then it will fall back to plain text and add a tag _jsonparsefailure. Upon a JSON failure, the payload will be stored in the message field.

And also it produces a lot of ERROR messages in logstash logs, like

[ERROR] 2019-06-05 11:45:50.347 [[main]<cloudwatch_logs] json - JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unrecognized token 'VM': was expecting ('true', 'false' or 'null')

at [Source: (String)"VM settings:"; line: 1, column: 3]>, :data=>"VM settings:"}

So when we print GC info, the Logstash log fills up with ERROR messages and becomes so large (>10 GB) that it causes our server to stop.
Is there a way to tune this behaviour?
For example, to not print these messages into the log file, or to skip plain text messages before they are passed to the json codec?

I have found one option:

bin/logstash [OPTIONS]
--log.format FORMAT           Specify if Logstash should write its own logs in JSON form (one
                              event per line) or in plain text (using Ruby's Object#inspect)
                              (default: "plain")

Maybe it will help you.

The skip_on_invalid_json option of the json filter does exactly that.

Possibly it will help. So I need to add an additional filter:

  json {
    skip_on_invalid_json => true
    source => ???
  }

As my whole input string is the source, how can I point the json filter plugin at it? The source option is required.

UPDATE 1.
After some research I understand that this won't help. Even if I handle the whole input text as the source, it will place the whole JSON into message.
But the json codec generates an additional field for each field in the input JSON (message included).
So the resulting document in my Elasticsearch cluster will be

"_source": {
    "profile": "dev",
    "message": "Logstash is awesome :)",
    "level": "INFO",
    "hostname": "xxx",
    "logger_name": "xxx",
    "app": "xxx",
    "@timestamp": "2019-06-05T09:36:30.469+00:00"
}

This option (--log.format) only changes the format of the log messages, but it doesn't filter or suppress the ERRORs in any way. Or did I misunderstand something?

Do not use a codec, use a json filter instead, with "message" as the source.
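
For illustration, a minimal sketch of that setup. It assumes the cloudwatch_logs input that appears in the error log above and leaves its other settings out, since they are not shown in this thread. The idea is to read every line as plain text and let the json filter decide afterwards, with skip_on_invalid_json so that non-JSON lines pass through quietly.

```
input {
  cloudwatch_logs {
    # ... existing settings unchanged (not shown in this thread) ...
    codec => plain    # instead of codec => json, so nothing is parsed at input time
  }
}

filter {
  json {
    source => "message"             # the raw line, whether JSON or plain text
    skip_on_invalid_json => true    # non-JSON lines are left as they are, without a failure tag or log entry
  }
}
```

With no target set, the json filter writes the parsed keys to the top level of the event, so for the JSON lines the "message" key from the application log should replace the raw string and the event ends up with the same fields the codec produced. Plain lines such as "VM settings:" simply keep their text in message.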
