Pretty-printed JSON is not parsed by the Logstash S3 input plugin

We're trying to parse a multiline JSON file from an S3 bucket, which results in a "_jsonparsefailure" tag. The plugin reads the file line by line. The codec we're using is json_lines.

Our logstash config looks like this:

input {
  s3 {
    bucket => "${S3_BUCKET_NAME}"
    region => "${AWS_REGION}"
    codec => json_lines
  }
}
filter {
  split {
    field => "fieldName"
  }
}
output {
  # elasticsearch config
}

The file contains only one JSON object, and fieldName in this case is an array inside that object.

The json_lines codec expects each line to be a complete JSON object. If your JSON object is pretty-printed across multiple lines you will need to use a multiline codec.
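For reference, a multiline codec that reassembles a pretty-printed object could look something like the sketch below. The pattern and option values are illustrative, not tested against your file: it joins every line that does not begin with a top-level "{" onto the previous event, so a single pretty-printed object becomes one event.

```
input {
  s3 {
    bucket => "${S3_BUCKET_NAME}"
    region => "${AWS_REGION}"
    codec => multiline {
      # lines NOT starting with "{" belong to the previous event
      pattern => "^\{"
      negate => true
      what => "previous"
      # flush the buffered event if no further lines arrive
      auto_flush_interval => 2
    }
  }
}
```

The reassembled object lands in the message field as one event; a json filter can then parse it before your split filter runs.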

@Badger Thanks for your reply. With large files, wouldn't the multiline codec hit a limit? I remember it breaking the file apart because it was too large, and the resulting JSON then no longer made sense.

The multiline codec has options to set the limit on the number of bytes and lines that can be combined. The defaults are 500 lines and 10 megabytes. You are free to increase them if you need to.
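For example, max_lines and max_bytes can be raised like this (the values below are just placeholders above the defaults, not a recommendation):

```
codec => multiline {
  pattern => "^\{"
  negate => true
  what => "previous"
  max_lines => 2000       # default is 500
  max_bytes => "50 MiB"   # default is "10 MiB"
}
```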

Great! Thanks a lot @Badger

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.