JSON codec polluting logs with ERROR messages if input is plain string

Ivan_Smurygin · June 5, 2019, 12:30pm

Hello.
I have a question about the json codec.

Generally we use the codec and it works fine. Our application produces logs like this:

{
    "profile": "dev",
    "logging-context": "default",
    "hostname": "xxx",
    "app": "xxx",
    "@timestamp": "2019-06-05T09:36:30.469+00:00",
    "level": "INFO",
    "logger_name": "xxx",
    "thread_name": "main",
    "message": "Logstash is awesome :)"
}

Sometimes in our DEV environment we enable JVM profiling logs, which produce plain strings: for example, GC logs or additional info about JVM attributes that we use to track performance issues.

As the documentation describes:

If this codec receives a payload from an input that is not valid JSON, then it will fall back to plain text and add a tag _jsonparsefailure. Upon a JSON failure, the payload will be stored in the message field.

It also produces a lot of ERROR messages in the Logstash logs, like:

[ERROR] 2019-06-05 11:45:50.347 [[main]<cloudwatch_logs] json - JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unrecognized token 'VM': was expecting ('true', 'false' or 'null')

at [Source: (String)"VM settings:"; line: 1, column: 3]>, :data=>"VM settings:"}

So when we print GC info, the Logstash log fills up with ERROR messages and becomes so large (>10 GB) that it causes our server to stop.
Is there a way to tune this behaviour?
For example, to not print these messages to the log file, or to skip plain text messages before they are passed to the json codec?

pastechecker · June 5, 2019, 1:18pm

I have found one option:

bin/logstash [OPTIONS]
--log.format FORMAT           Specify if Logstash should write its own logs in JSON form (one
                              event per line) or in plain text (using Ruby's Object#inspect)
                               (default: "plain")

Maybe it will help you.

Badger · June 5, 2019, 1:37pm

The skip_on_invalid_json option of the json filter does exactly that.

Ivan_Smurygin · June 5, 2019, 1:42pm

Possibly it will help. So I need to add an additional filter:

  json {
    skip_on_invalid_json => true
    source => ???
  }

Since my whole input string is the source, how can I point the json filter plugin at it? The source option is required.

UPDATE 1.
After some research I understand that this won't help. Even if I treat the whole input text as the source, it will place the whole JSON into message.
But the json codec generates a separate field for each field in the input JSON (message included).
So with the codec, the resulting document in my Elasticsearch cluster is

"_source": {
    "profile": "dev",
    "message": "Logstash is awesome :)",
    "level": "INFO",
    "hostname": "xxx",
    "logger_name": "xxx",
    "app": "xxx",
    "@timestamp": "2019-06-05T09:36:30.469+00:00"
}

Ivan_Smurygin · June 5, 2019, 2:26pm

This one only changes the format of the log messages, but doesn't filter or catch the ERRORs in any way. Or did I misunderstand something?

Badger · June 5, 2019, 3:46pm

Do not use a codec, use a json filter instead, with "message" as the source.
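
A minimal sketch of what that could look like, assuming the input is left on its default codec so that each incoming line lands in the message field as a plain string. Only the filter section is shown; the input and output sections are assumed to stay as they are, apart from removing the json codec from the input.

```
filter {
  json {
    # Parse the raw line that the input stored in "message".
    source => "message"
    # For lines that are not valid JSON (for example GC output),
    # skip the filter quietly instead of tagging the event and
    # logging a parse failure.
    skip_on_invalid_json => true
  }
}
```

With no target set, the json filter places the parsed keys at the top level of the event, so a "message" key inside the JSON should replace the raw line, which matches what the codec produced. Plain-text lines pass through unchanged with the original text still in message.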

