Error parsing pretty-printed JSON data

Hi team,

I have a file in JSON format:

[
    {
        "year": 2013,
        "title": "Rush",
        "info": {
            "directors": ["Ron Howard"],
            "release_date": "2013-09-02T00:00:00Z",
            "rating": 8.3,
            "genres": [
                "Action",
                "Biography",
                "Drama",
                "Sport"
            ],
            "image_url": "http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg",
            "plot": "A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.",
            "rank": 2,
            "running_time_secs": 7380,
            "actors": [
                "Daniel Bruhl",
                "Chris Hemsworth",
                "Olivia Wilde"
            ]
        }
    }
]

I want to parse this file, but I am getting errors from the Logstash json filter. My configuration:

input {
  file {
    path => "/usr/share/logstash/moviedata.json"
    start_position => "beginning"
    #sincedb_path => "/usr/share/logstash/dbteste"
  }
}

filter {
  json {
    source => "message"
  }
}

Logstash logs:

[2021-12-29T16:01:13,824][WARN ][logstash.filters.json    ][movie][2e0349d64f8ab2889caf33284adb464eb2adf3d4b19e77ad5922b144d2a77e57] Error parsing json {:source=>"message", :raw=>"    }", :exception=>#<LogStash::Json::ParserError: Unexpected close marker '}': expected ']' (for root starting at [Source: (byte[])"    }"; line: 1, column: 0])
 at [Source: (byte[])"    }"; line: 1, column: 6]>}
[2021-12-29T16:01:13,825][WARN ][logstash.filters.json    ][movie][2e0349d64f8ab2889caf33284adb464eb2adf3d4b19e77ad5922b144d2a77e57] Error parsing json {:source=>"message", :raw=>"                \"Action\",", :exception=>#<LogStash::Json::ParserError: Unexpected character (',' (code 44)): expected a value
 at [Source: (byte[])"                "Action","; line: 1, column: 26]>}
[2021-12-29T16:01:13,830][WARN ][logstash.filters.json    ][movie][2e0349d64f8ab2889caf33284adb464eb2adf3d4b19e77ad5922b144d2a77e57] Error parsing json {:source=>"message", :raw=>"                \"Biography\",", :exception=>#<LogStash::Json::ParserError: Unexpected character (',' (code 44)): expected a value
 at [Source: (byte[])"                "Biography","; line: 1, column: 29]>}
[2021-12-29T16:01:13,836][WARN ][logstash.filters.json    ][movie][2e0349d64f8ab2889caf33284adb464eb2adf3d4b19e77ad5922b144d2a77e57] Error parsing json {:source=>"message", :raw=>"            \"plot\": \"A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.\",", :exception=>#<LogStash::Json::ParserError: Unexpected character (':' (code 58)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
 at [Source: (byte[])"            "plot": "A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.","; line: 1, column: 20]>}
[2021-12-29T16:01:13,837][WARN ][logstash.filters.json    ][movie][2e0349d64f8ab2889caf33284adb464eb2adf3d4b19e77ad5922b144d2a77e57] Error parsing json {:source=>"message", :raw=>"                \"Daniel Bruhl\",", :exception=>#<LogStash::Json::ParserError: Unexpected character (',' (code 44)): expected a value
 at [Source: (byte[])"                "Daniel Bruhl","; line: 1, column: 32]>}

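The file input creates one event per line, so the json filter is being handed fragments such as "    }" or "\"Action\"," instead of a complete JSON document. You need to use a multiline codec to combine the pretty-printed JSON into a single event. There is an example of that here.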
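
As a rough sketch of the multiline approach (not necessarily the linked example), the pipeline could look something like the following. The pattern string is an arbitrary placeholder assumed never to occur in the file, the target field name movies is hypothetical, and sincedb_path => "/dev/null" is only there so the file is re-read on every test run. Because the root of the document is an array, the json filter is given a target, and a split filter then turns each array element into its own event.

```
input {
  file {
    path => "/usr/share/logstash/moviedata.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      # Placeholder pattern, assumed never to match any line in the file.
      # With negate => true, every non-matching line is appended to the
      # previous one, so the whole file becomes a single event.
      pattern => "^ThisPatternNeverMatches"
      negate => true
      what => "previous"
      # Emit the buffered event once no new line has arrived for 2 seconds.
      auto_flush_interval => 2
    }
  }
}

filter {
  # The root of the document is an array, so store it under a field
  # ("movies" is a hypothetical name).
  json {
    source => "message"
    target => "movies"
  }
  # Produce one event per element of the array.
  split {
    field => "movies"
  }
}
```

Keep in mind that the multiline codec has max_lines and max_bytes limits (500 lines and 10 MiB by default), so a large pretty-printed file would need those raised.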

Another possible approach is to convert your data to ndjson (newline-delimited JSON) using jq, found here. Then you do not need to worry about the multiline codec at all.

jq is a very powerful JSON tool.

Say your file looks like this (I put two records in, but it will work with any number):

$ cat json-pretty-sample.json 
[
  {
    "year": 2013,
    "title": "Rush",
    "info": {
      "directors": [
        "Ron Howard"
      ],
      "release_date": "2013-09-02T00:00:00Z",
      "rating": 8.3,
      "genres": [
        "Action",
        "Biography",
        "Drama",
        "Sport"
      ],
      "image_url": "http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg",
      "plot": "A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.",
      "rank": 2,
      "running_time_secs": 7380,
      "actors": [
        "Daniel Bruhl",
        "Chris Hemsworth",
        "Olivia Wilde"
      ]
    }
  },
  {
    "year": 2015,
    "title": "Other Movie",
    "info": {
      "directors": [
        "Ron Howard"
      ],
      "release_date": "2013-09-02T00:00:00Z",
      "rating": 8.3,
      "genres": [
        "Action",
        "Biography",
        "Drama",
        "Sport"
      ],
      "image_url": "http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg",
      "plot": "A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.",
      "rank": 2,
      "running_time_secs": 7380,
      "actors": [
        "Daniel Bruhl",
        "Chris Hemsworth",
        "Olivia Wilde"
      ]
    }
  }
]

I can simply run jq and tell it to write ndjson.

In this command, -c tells jq to write compact output (one JSON document per line, which is ndjson), and .[] emits every element of the top-level array. The filter is quoted so that the shell does not try to expand the brackets.

$ jq -c '.[]' json-pretty-sample.json > sample.ndjson

Now the file is ndjson, which Logstash can read easily without the multiline codec:

$ cat sample.ndjson 
{"year":2013,"title":"Rush","info":{"directors":["Ron Howard"],"release_date":"2013-09-02T00:00:00Z","rating":8.3,"genres":["Action","Biography","Drama","Sport"],"image_url":"http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg","plot":"A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.","rank":2,"running_time_secs":7380,"actors":["Daniel Bruhl","Chris Hemsworth","Olivia Wilde"]}}
{"year":2015,"title":"Other Movie","info":{"directors":["Ron Howard"],"release_date":"2013-09-02T00:00:00Z","rating":8.3,"genres":["Action","Biography","Drama","Sport"],"image_url":"http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg","plot":"A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.","rank":2,"running_time_secs":7380,"actors":["Daniel Bruhl","Chris Hemsworth","Olivia Wilde"]}}

This conf reads the ndjson file and parses everything just fine.
Note that I used the json codec to read the file. I often confuse the json and json_lines codecs, but for reading ndjson from a file, codec => "json" gets the job done, because the file input already delivers one line at a time (json_lines is meant for stream inputs such as tcp). You can also use the json filter instead, but the codec is more direct.

input {
  file {
    path => "/Users/sbrown/workspace/sample-data/discuss/logstash/sample.ndjson"
    start_position => "beginning"
    codec => "json"
    sincedb_path => "/dev/null"
  }
}

output {
  stdout { codec => rubydebug }
}

Output:

{
          "year" => 2015,
         "title" => "Other Movie",
      "@version" => "1",
          "info" => {
                   "genres" => [
            [0] "Action",
            [1] "Biography",
            [2] "Drama",
            [3] "Sport"
        ],
                     "rank" => 2,
        "running_time_secs" => 7380,
                   "actors" => [
            [0] "Daniel Bruhl",
            [1] "Chris Hemsworth",
            [2] "Olivia Wilde"
        ],
             "release_date" => "2013-09-02T00:00:00Z",
                "image_url" => "http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg",
                "directors" => [
            [0] "Ron Howard"
        ],
                   "rating" => 8.3,
                     "plot" => "A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda."
    },
    "@timestamp" => 2021-12-29T17:59:23.458Z,
          "host" => "hyperion",
          "path" => "/Users/sbrown/workspace/sample-data/discuss/logstash/sample.ndjson"
}
.......
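
For comparison, the filter-based variant mentioned above might look roughly like this. It is only a sketch: the input uses the default plain codec, so each ndjson line arrives in the message field and the json filter parses it from there. The remove_field setting is an optional addition, not part of the original conf, that drops the raw line once parsing succeeds.

```
input {
  file {
    path => "/Users/sbrown/workspace/sample-data/discuss/logstash/sample.ndjson"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    # No codec here: each line of the file is placed in "message".
  }
}

filter {
  json {
    source => "message"
    # Optional: discard the raw JSON text after a successful parse.
    remove_field => ["message"]
  }
}

output {
  stdout { codec => rubydebug }
}
```

Each ndjson line is a JSON object, so no target is needed and the parsed keys land at the top level of the event, as with the codec version.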

Hey @stephenb, it is working now. Thanks for introducing me to the jq command.

Thanks, @Badger.
