Trouble with Logstash JSON parsing

I'm calling an API from a script. The script runs the results through jq to rearrange the fields and writes the output to a .json file. In Logstash I'm using the file input to read that file and trying to output to Elasticsearch. I've run the JSON file through jsonlint and it is valid, but I'm receiving the following error when I output to stdout:

[logstash.filters.json ] Error parsing json {:source=>"message", :raw=>" }", :exception=>#<LogStash::Json::ParserError: Unexpected close marker '}': expected ']'
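For reference, my pipeline config is roughly the following (the path and the Elasticsearch hosts are placeholders):

input {
  file {
    path => "/tmp/api_output.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"    # re-read the file on every test run
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch { hosts => ["localhost:9200"] }
}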

Here is a sample of the json input file:

[
  {
    "id": "7744",
    "canUpdate": true,
    "canDelete": true,
    "canArchive": true,
    "info": [
      {
        "problem_type": null,
        "Category": ""
      },
      {
        "closure_information": 0,
        "Closure Information": ""
      },
      {
        "insert_time": 1509992459000,
        "Request time": "2017-11-06 13:20:59.0"
      }
    ]
  },
  {
    "id": "47",
    "canUpdate": true,
    "canDelete": true,
    "canArchive": true,
    "info": [
      {
        "problem_type": "PAN",
        "Category": "PAN"
      },
      {
        "closure_information": 1,
        "Closure Information": "Solved (Permanently)"
      },
      {
        "insert_time": 1466446314000,
        "Request time": "2016-06-20 14:11:54.0"
      }
    ]
  }
]

The file input reads files line by line. If you want to slurp an entire file into a single event you need to use a multiline codec. I'm sure examples have been posted in the past.
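In short, something like this (a minimal sketch; the path and the deliberately unmatchable pattern are placeholders):

input {
  file {
    path => "/tmp/api_output.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      # No line ever matches this pattern; with negate => true and
      # what => "previous", every line is therefore glued to the
      # previous one, so the whole file becomes a single event.
      pattern => "^Spalanzani"
      negate => true
      what => "previous"
      # Emit the accumulated event after one second of silence,
      # otherwise the final event would never be flushed.
      auto_flush_interval => 1
    }
  }
}
filter {
  json {
    source => "message"
  }
}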

Magnus -

Thank you for your input. I have certainly seen examples of the multiline codec. I'll check that out.

Since the file being read is just created by a bash script, could I simply run the script and pipe its results directly to Logstash? That would remove the need to create and read a separate file.

Sure, you could do that but I don't recommend it.

  • If Logstash hangs (e.g. because its outputs are unavailable), your bash script will also hang once it has produced enough data to fill its pipe buffer.
  • Logstash can't be restarted while the script is running.
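For what it's worth, the piping approach would look roughly like this (the script name is hypothetical, and the script would have to emit one JSON object per line for the json_lines codec to work):

./fetch_api.sh | bin/logstash -f pipeline.conf

with a pipeline along the lines of:

input {
  stdin {
    codec => json_lines
  }
}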

Makes sense. I may just look at pushing directly to Elasticsearch then. Thanks.

Pushing to Elasticsearch worked, BUT the entire array went in as one single document. What I need is each "id" as a separate document. I originally tried the exec Logstash input, which ran cat on the file. The output from that looked perfect, meaning each "id" ended up on its own line with its associated information. In your opinion, might this be a better approach?

Sure, using an exec input is probably more straightforward than a file input. Of course, you don't get any tracking of what data you've processed but that's perhaps not a problem.
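A minimal sketch of that approach (the command, interval, and target field name are illustrative). The split filter is what turns each element of the parsed array into its own event, so each "id" is indexed as a separate document:

input {
  exec {
    command => "cat /tmp/api_output.json"
    interval => 3600    # re-run the command once an hour
  }
}
filter {
  # The whole command output arrives as one event; parse the JSON
  # array into the "records" field...
  json {
    source => "message"
    target => "records"
  }
  # ...then split it so each array element becomes its own event.
  split {
    field => "records"
  }
}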

After examining it further, it looks like the output is "pretty printed", with a newline character (\n) after every line, including those containing just a "[" or "{". What would be the best way to deal with this?

If you use an exec input that's not a problem, and if you use a file input you need a multiline codec, as I said earlier.
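One more option, since jq already produces the file: have it emit one compact object per line instead of a pretty-printed array. Each line is then a complete JSON document, and a plain json codec on the file input works without any multiline handling. A sketch, assuming the script's jq invocation can be changed (the URL and path are placeholders):

# '.[]' unwraps the top-level array; -c prints each element
# on a single line instead of pretty-printing it.
curl -s "$API_URL" | jq -c '.[]' > /tmp/api_output.json

and on the Logstash side:

input {
  file {
    path => "/tmp/api_output.json"
    codec => json    # each line is parsed as one event
  }
}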
