Best way to ingest JSON files?


(Craig Foote) #1

I have 300 files to ingest. Each is pretty-printed JSON and contains one event. I tried using the file input and the multiline codec:

input{
    file{
        path => "/tmp/ID*.json"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        codec => multiline{
            # each event starts with {"response"
            pattern => "^{\"response\".*$"
            negate => true
            what => "previous"
        }
    }
}

filter{
    json{
        source => "message"
    }
}

output{
    stdout{codec=>rubydebug}
}

But nothing is displayed. In debug mode, Logstash seems to wait at the end of each file to determine where the event ends, i.e. it waits for the start of a new event before releasing the current file's event.

What's the best thing to do here? Do I have to get un-prettified versions of the files somehow, or concatenate them all together (and lose the last event, because multiline waits forever for something to follow it)? Does anyone see any alternatives? Any help appreciated.


(Joe Lawson) #2

Have you tried codec => json{}?


(Craig Foote) #3

No, I haven't. I thought it was for single-line JSON. I ended up concatenating the files together :(


(Craig Foote) #4

I just tried the JSON codec and got _jsonparsefailure tags. I don't think it works for multiline JSON.
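
A rough illustration of why this happens, assuming the file input hands the codec one line at a time: no single line of a pretty-printed document is a complete JSON value, so every line fails to parse on its own. The sample document and the helper name count_line_failures are made up for this sketch; it only mimics line-by-line parsing in Python and does not run Logstash.

```python
import json

# Hypothetical sample event, pretty-printed the way the files in this thread are.
pretty = json.dumps({"response": {"id": 1, "status": "ok"}}, indent=4)


def count_line_failures(text):
    """Parse each line as a standalone JSON document and count the failures."""
    failures = 0
    for line in text.splitlines():
        try:
            json.loads(line)
        except json.JSONDecodeError:
            failures += 1
    return failures


# Every line of the pretty-printed form fails when parsed on its own,
# while the same event written on a single line parses cleanly.
compact = json.dumps(json.loads(pretty))
print(count_line_failures(pretty), count_line_failures(compact))
```

The same event collapsed onto one line parses without trouble, which is why a one-event-per-line layout suits a line-oriented input.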


(Andrew Cholakian) #5

That's correct, Craig. JSON that isn't newline-delimited is somewhat tricky to parse, and we don't support it yet. If you can reprocess the concatenated JSON with another tool and write it out as one event per line, that will make things easier.
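
One possible way to do that reprocessing, as a minimal sketch in Python rather than anything Logstash provides: walk the concatenated text with json.JSONDecoder.raw_decode, which reports where each top-level value ends, and re-serialize each value on its own line. The function names split_concatenated_json and to_ndjson are hypothetical, and the sketch assumes the whole input fits in memory.

```python
import json


def split_concatenated_json(text):
    """Yield each top-level JSON value found in text, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the decoded value and the index just past its end.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def to_ndjson(text):
    """Re-serialize every value in text as compact JSON, one per line."""
    return "".join(
        json.dumps(obj, separators=(",", ":")) + "\n"
        for obj in split_concatenated_json(text)
    )


if __name__ == "__main__":
    sample = '{\n  "response": {"id": 1}\n}\n{\n  "response": {"id": 2}\n}\n'
    print(to_ndjson(sample), end="")
```

Writing the result to a file gives one event per line, which a line-oriented input with codec => json should then be able to read. Because each event ends with a newline, the last one is no longer left waiting for a following event.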


(Craig Foote) #6

Thanks Andrew. I'll work on that.

