I've been trying to use various input and filter plugins to take a JSON file containing an array of prettified json objects and spit out an event for each object. I've scoured SO and this forum and found a couple of things that have me almost there but am hoping you could all shed some light on my issue.
This will work when my json file contains an array of one object but I'd like it to work with multiple, ideally when they're in a comma-separated array. I've tried using the split operator using field => "message" and while I think Ruby could be helpful here, I don't know how to send one object at a time, squash it into one line, and then pass it through the json filter to turn it into its own event.
Any help would be greatly appreciated. Thanks in advance!
All right so I managed to figure it out with Ruby, but was hoping Logstash would do more of it out of the box. For anyone else who tries to do the same thing, here's my configuration:
There is a little known side effect of setting an impossible delimiter in both the file input and the json_lines codec - it means the whole file is read as the payload to the json_lines codec. The json_line codec will create an event from each object in your JSON array. However, bear in mind the potential memory usage when the whole file is read into memory.
Thank you for both for your replies! I actually encountered some issues as you mentioned when trying to load the whole file into memory, so I modified my conf accordingly. Leaving it here for future reference in case anyone else needs it!
input {
file {
codec => multiline {
pattern => "^\s\s{"
negate => true
what => previous
auto_flush_interval => 1
}
path => "path/to/json"
start_position => "beginning"
sincedb_path => "dev/null"
}
}
filter {
mutate {
gsub => ["message", "\[", ""]
gsub => ["message", "\]", ""]
gsub => ["message", "^\s\s},", "}"]
}
# The first message from the input is "" so we shouldn't use the json filter on that
if ["message"] != "" {
json {
source => "message"
}
}
mutate {
remove_field => ["message"]
}
}
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.