Prettified json is not parsed by logstash s3 input plugin

Anubha · May 19, 2021, 11:32am

We're trying to parse a multiline JSON file from an S3 bucket, but the events end up tagged with "_jsonparsefailure".
Logstash reads the file line by line. The codec we're using is json_lines.

Our logstash config looks like this:

input {
  s3 {
    bucket => "${S3_BUCKET_NAME}"
    region => "${AWS_REGION}"
    codec => json_lines
  }
}
filter {
  split {
    field => "fieldName"
  }
}
output {
  # elasticsearch config
}

The file contains a single JSON object, and fieldName is an array inside that object.

Badger · May 19, 2021, 4:10pm

The json_lines codec expects each line to be a complete JSON object. If your JSON object is pretty-printed across multiple lines you will need to use a multiline codec.
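
A rough, untested sketch of that suggestion: the config below swaps json_lines for the multiline codec and then parses the joined text with a json filter. The pattern is an assumption, not something stated in the thread. It presumes the top-level object opens with "{" in the first column and that every nested line is indented, as most pretty-printers produce. It also assumes the pending event is flushed when the end of each file is reached, so the last (or only) object is emitted.

```
input {
  s3 {
    bucket => "${S3_BUCKET_NAME}"
    region => "${AWS_REGION}"
    codec => multiline {
      # Assumed pattern: a line starting with "{" begins a new event.
      pattern => "^\{"
      # Lines that do NOT match the pattern are appended to the previous line.
      negate => true
      what => "previous"
    }
  }
}
filter {
  # The multiline codec emits the joined text in "message"; parse it as JSON.
  json {
    source => "message"
  }
  split {
    field => "fieldName"
  }
}
output {
  # elasticsearch config
}
```

If the file format differs (for example, nested objects that also start in the first column), the pattern would need to change accordingly.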

Anubha · May 19, 2021, 5:14pm

@Badger Thanks for your reply. With large files, wouldn't the multiline codec reach a limit? I remember it splitting the file because it was too large, after which the JSON no longer made sense.

Badger · May 19, 2021, 5:48pm

The multiline codec has options to set the limit on the number of bytes and lines that can be combined (max_bytes and max_lines). The defaults are 500 lines and 10 megabytes. You are free to increase them if you need to.
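
For illustration, raising those limits might look like the codec block below. The values and the pattern are placeholders chosen for this sketch, not recommendations from the thread. Size them to the largest file you expect, keeping in mind that the whole combined event is held in memory.

```
codec => multiline {
  # Assumed pattern: a line starting with "{" begins a new event.
  pattern => "^\{"
  negate => true
  what => "previous"
  # Hypothetical limits, raised from the defaults of 500 lines and 10 MiB.
  max_lines => 100000
  max_bytes => "100 MiB"
}
```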

Anubha · May 19, 2021, 6:01pm

Great! Thanks a lot @Badger

system · June 16, 2021, 6:01pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
