I'm calling an API from a script. The script then runs the results through jq to rearrange the fields, and the output is written to a .json file. In Logstash I'm using the file input to read that file and trying to output to Elasticsearch. I've run the JSON file through jsonlint and it is valid, but I'm receiving the following error when I output to stdout:
The file input reads files line by line. If you want to slurp an entire file into a single event you need to use a multiline codec. I'm sure examples have been posted in the past.
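A minimal sketch of that approach, assuming the jq output lives at /path/to/output.json (a placeholder) and a Logstash version whose multiline codec supports auto_flush_interval:

```
input {
  file {
    path => "/path/to/output.json"    # placeholder path to the jq output file
    start_position => "beginning"
    sincedb_path => "/dev/null"       # reread the file every run; adjust for real use
    codec => multiline {
      # A pattern that never matches, so every line is appended to
      # the previous ones and the whole file becomes a single event.
      pattern => "^WILL_NEVER_MATCH"
      negate => true
      what => "previous"
      # Flush the accumulated event after a short pause, since nothing
      # else will ever signal the end of the file.
      auto_flush_interval => 2
    }
  }
}
```

The event's message field then holds the entire file contents, which a json filter can parse.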
Since the output file being read is actually just created by a bash script, could I simply run the script and pipe the results directly into Logstash? That would eliminate the need to create and read a separate file.
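Something like this is what I had in mind, with the script's output piped straight into a stdin input (just a sketch; the script name is a placeholder and the codec choice depends on how the output is formatted):

```
# Roughly: my_script.sh | bin/logstash -f pipeline.conf
input {
  stdin {
    # If the script emitted one JSON object per line, the json_lines
    # codec would decode each line into its own event.
    codec => json_lines
  }
}
```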
Pushing to Elasticsearch worked, BUT the entire array went in as one single document. What I need is each "id" indexed separately. I originally tried the Logstash exec input, which used cat on the file, and the output from that looked perfect: each "id" with its associated information on a separate line. In your opinion, might that be a better approach?
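For reference, the exec input I tried looked roughly like this (the path and interval are placeholders):

```
input {
  exec {
    command => "cat /path/to/output.json"   # placeholder path to the jq output file
    interval => 60                           # run once a minute
  }
}
```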
Sure, using an exec input is probably more straightforward than a file input. Of course, you don't get any tracking of what data you've processed, but that's perhaps not a problem.
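If the command output is a top-level JSON array, one way to get a separate event per "id" is to parse it with a json filter and then break it apart with a split filter. This is only a sketch, and the split filter is my addition rather than something you've tried; the "records" field name is arbitrary:

```
filter {
  json {
    source => "message"
    # A top-level array can't be merged into the event root,
    # so parse it into its own field.
    target => "records"
  }
  split {
    # Emit one event per element of the parsed array.
    field => "records"
  }
}
```

After the split, each element's fields live under records, so a mutate filter (or a different target) may be needed to move them to the top level before indexing.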
After further examination, it looks like the output is "pretty printed", with a newline character (\n) after each and every line, including those containing just a "[" or "{". What would be the best way to deal with this?