Error in parsing Pretty Json Data

stephenb · December 29, 2021, 5:26pm

Another possible approach is to simply turn your data into ndjson (newline-delimited JSON) using jq; then you would not need to worry about multiline handling at all.

It is a super powerful JSON tool.

Say your file looks like this (I put 2 records in, but it will work with 1 to n):

$ cat json-pretty-sample.json 
[
  {
    "year": 2013,
    "title": "Rush",
    "info": {
      "directors": [
        "Ron Howard"
      ],
      "release_date": "2013-09-02T00:00:00Z",
      "rating": 8.3,
      "genres": [
        "Action",
        "Biography",
        "Drama",
        "Sport"
      ],
      "image_url": "http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg",
      "plot": "A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.",
      "rank": 2,
      "running_time_secs": 7380,
      "actors": [
        "Daniel Bruhl",
        "Chris Hemsworth",
        "Olivia Wilde"
      ]
    }
  },
  {
    "year": 2015,
    "title": "Other Movie",
    "info": {
      "directors": [
        "Ron Howard"
      ],
      "release_date": "2013-09-02T00:00:00Z",
      "rating": 8.3,
      "genres": [
        "Action",
        "Biography",
        "Drama",
        "Sport"
      ],
      "image_url": "http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg",
      "plot": "A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.",
      "rank": 2,
      "running_time_secs": 7380,
      "actors": [
        "Daniel Bruhl",
        "Chris Hemsworth",
        "Olivia Wilde"
      ]
    }
  }
]

I can simply run jq and tell it to write ndjson.

This command tells jq to write compact output, one JSON value per line (-c), and to emit each element of the top-level array (.[]):

$ cat json-pretty-sample.json | jq -c '.[]' > sample.ndjson

Now the file is ndjson, which Logstash can easily read without any multiline configuration:

$ cat sample.ndjson 
{"year":2013,"title":"Rush","info":{"directors":["Ron Howard"],"release_date":"2013-09-02T00:00:00Z","rating":8.3,"genres":["Action","Biography","Drama","Sport"],"image_url":"http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg","plot":"A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.","rank":2,"running_time_secs":7380,"actors":["Daniel Bruhl","Chris Hemsworth","Olivia Wilde"]}}
{"year":2015,"title":"Other Movie","info":{"directors":["Ron Howard"],"release_date":"2013-09-02T00:00:00Z","rating":8.3,"genres":["Action","Biography","Drama","Sport"],"image_url":"http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg","plot":"A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.","rank":2,"running_time_secs":7380,"actors":["Daniel Bruhl","Chris Hemsworth","Olivia Wilde"]}}
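If jq is not available on the machine, the same conversion can be sketched in a few lines of Python. This is only an illustration of what the jq command above does, assuming the whole file fits in memory and the top level is an array; the function name pretty_array_to_ndjson is made up for this example.

```python
import json
from pathlib import Path


def pretty_array_to_ndjson(src: str, dst: str) -> int:
    """Write each element of the top-level JSON array in src as one compact line in dst.

    Hypothetical helper for illustration; returns the number of records written.
    """
    records = json.loads(Path(src).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    with open(dst, "w", encoding="utf-8") as out:
        for record in records:
            # separators=(",", ":") removes the spaces after commas and colons,
            # so each record ends up on a single compact line.
            line = json.dumps(record, separators=(",", ":"), ensure_ascii=False)
            out.write(line + "\n")
    return len(records)
```

Calling pretty_array_to_ndjson("json-pretty-sample.json", "sample.ndjson") on the sample file should give one line per movie, the same shape as the jq output shown above.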

This conf reads the ndjson file and parses everything just fine.
Note that I used the json codec to read the file. I often confuse the two codecs, json and json_lines, but for reading ndjson from a file, codec => "json" gets the job done, because the file input already hands the codec one line at a time. You can also use the json filter instead, but this is pretty direct.

input {
  file {
    path => "/Users/sbrown/workspace/sample-data/discuss/logstash/sample.ndjson"
    start_position => "beginning"
    codec => "json"
    sincedb_path => "/dev/null"
  }
}

output {
  stdout { codec => rubydebug }
}

Output

{
          "year" => 2015,
         "title" => "Other Movie",
      "@version" => "1",
          "info" => {
                   "genres" => [
            [0] "Action",
            [1] "Biography",
            [2] "Drama",
            [3] "Sport"
        ],
                     "rank" => 2,
        "running_time_secs" => 7380,
                   "actors" => [
            [0] "Daniel Bruhl",
            [1] "Chris Hemsworth",
            [2] "Olivia Wilde"
        ],
             "release_date" => "2013-09-02T00:00:00Z",
                "image_url" => "http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg",
                "directors" => [
            [0] "Ron Howard"
        ],
                   "rating" => 8.3,
                     "plot" => "A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda."
    },
    "@timestamp" => 2021-12-29T17:59:23.458Z,
          "host" => "hyperion",
          "path" => "/Users/sbrown/workspace/sample-data/discuss/logstash/sample.ndjson"
}
.......
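To see why the ndjson file needs no multiline handling, here is a small Python sketch of line-by-line parsing. It is not how Logstash is implemented, just an illustration of the idea: every ndjson line is a complete JSON document on its own, while a single line of the pretty-printed file is not. The function name and the shortened sample strings are made up for this example.

```python
import json


def parse_line_by_line(text: str) -> list:
    """Parse every non-empty line as its own JSON document,
    the way a line-oriented reader would."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Shortened stand-ins for sample.ndjson and json-pretty-sample.json.
ndjson_text = '{"year":2013,"title":"Rush"}\n{"year":2015,"title":"Other Movie"}\n'
pretty_text = '[\n  {\n    "year": 2013,\n    "title": "Rush"\n  }\n]\n'

events = parse_line_by_line(ndjson_text)
print([event["title"] for event in events])

try:
    parse_line_by_line(pretty_text)
except json.JSONDecodeError:
    # The first line is just "[", which is not valid JSON by itself.
    print("pretty-printed input cannot be parsed one line at a time")
```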
