Filebeat - Decoding Json logs with Linebreaks


(Ali M) #1

I was able to get Filebeat working with JSON log files. However, when a log message contains line breaks, the JSON parser cannot decode it.

For example, the following log line can not be decoded correctly:

{"timestamp" :"2018-06-18 20:28:22.121", "message": "line1 \nline2 \nline3", "level":"info"}

Is there a way to fix this problem in Filebeat?


(Andrew Kroh) #2

Does the log message on disk contain the two characters \n or an actual newline? What error does Filebeat give while parsing? And what configuration are you using?

Using the exact JSON you posted above works for me.

filebeat.prospectors:
- paths: [input.json]
  json.keys_under_root: true

output.console.pretty: true
{
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.2.3"
  },
  "@timestamp": "2018-06-19T15:20:31.260Z",
  "beat": {
    "hostname": "macbook",
    "name": "macbook",
    "version": "6.2.3"
  },
  "level": "info",
  "message": "line1 \nline2 \nline3",
  "offset": 93,
  "source": "/Users/akroh/go/src/github.com/elastic/beats/filebeat/.test/json-newline/input.json",
  "timestamp": "2018-06-18 20:28:22.121"
}

(Ali M) #3

The log file itself contains line breaks (not just \n) and JSON decoding fails.

Here is my filebeat config:

filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /log/*.log  
  multiline.pattern: '^{'
  multiline.negate: true
  multiline.match: after
  processors:
     - decode_json_fields:
         fields: ["timestamp", "message", "level"]
         process_array: false
         max_depth: 1
         target: ""
         overwrite_keys: true

Here is the error message:

Invalid format: \"line2 \"
Invalid format: \"line3\", \"level\":\"info\"}\"

It looks like the line breaks cause the JSON decoder to receive part of a line (as opposed to the whole line) and fail.


(Andrew Kroh) #4

If the string value contains line feeds then it is not valid JSON, because all control characters inside strings must be escaped (see RFC 7159, section 7).

Filebeat (decode_json_fields) can handle pretty printed JSON where the object spans multiple lines, but it cannot handle this case where the string values contain control characters.
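To illustrate the distinction, the same value written with the control character escaped versus written with a raw line feed:

Valid JSON (line feed written as the two-character escape sequence \n):

{"message": "line1 \nline2"}

Invalid JSON (raw line feed inside the string):

{"message": "line1
line2"}

The first form is what decode_json_fields can handle; the second is what a logger that doesn't escape its output produces.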

One possible solution is to do the multiline in Filebeat and the JSON decoding in Logstash. Prior to the JSON filter you could replace the line feeds with \n or \u000a.
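As a rough sketch of that approach (assuming the multiline event arrives in Logstash's default message field, so treat this as a starting point rather than a drop-in config):

filter {
  # Escape the raw line feeds left in the multiline event so the
  # string values become valid JSON: the line feed control character
  # becomes the two-character escape sequence \n.
  mutate {
    gsub => ["message", "\n", "\\n"]
  }
  # The event is now valid JSON and can be decoded.
  json {
    source => "message"
  }
}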

Or you could modify the thing writing the logs to do JSON escaping.


(Ali M) #5

Makes sense. Thanks. Is there a way to pre-process messages in Filebeat to replace line breaks with '\n' without using Logstash? (By using pipelines, etc.)

Log message -> Replace line breaks with \n -> Decode Json -> Pretty print in Kibana.


(Andrew Kroh) #6

Without Logstash... you could probably accomplish it with an Ingest Node pipeline that uses gsub and then json.

Once you create the pipeline, add it to your prospector config via the pipeline setting.
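A rough sketch of such a pipeline (the pipeline id json-newlines is made up for this example, and the field names assume Filebeat's default message field):

PUT _ingest/pipeline/json-newlines
{
  "description": "Escape raw line feeds, then decode the JSON",
  "processors": [
    {
      "gsub": {
        "field": "message",
        "pattern": "\n",
        "replacement": "\\n"
      }
    },
    {
      "json": {
        "field": "message",
        "add_to_root": true
      }
    }
  ]
}

And then reference it from the prospector:

filebeat.prospectors:
- type: log
  paths:
    - /log/*.log
  multiline.pattern: '^{'
  multiline.negate: true
  multiline.match: after
  pipeline: json-newlines

Note that the pipeline setting only takes effect when Filebeat sends events directly to Elasticsearch.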


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.