Input JSON

Hey,

After a long time trying to get nested JSON working, I've just created a flat file instead: http://pastebin.com/0VVF4HNZ.

I feel like this should be trivial, but trying the json input codec, the json filter, and some attempts at multiline (all taken from here and Stack Overflow) has got me nowhere. I could write a script that uses XPUT to work its way down the file, since that works, but this looks like exactly what Logstash was designed to do :slight_smile:.

This is my latest iteration.

input {
  file {
    path => "/root/logstash-5.0.0/flat.json"
    start_position => beginning
    ignore_older => 0
    type => "zerto"
    sincedb_path => "/dev/null"
    codec => json
  }
}

output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}

I just get a string of errors like:
2016-11-09T16:36:47,737][ERROR][logstash.codecs.json ] JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: incompatible json object type=java.lang.String , only hash map or arrays are suppoted>, :data=>"\t"EntityType": 3,"}

Are you producing the JSON file yourself? If so, create the file like this instead, with one complete JSON object per line:

{"EntityType": 3,"EventCategory": "Alerts",...}
{"EntityType": 3,"EventCategory": "Alerts",...}
{"EntityType": 3,"EventCategory": "Alerts",...}
...
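
For anyone wondering why the layout matters: the error above shows the codec being handed a single line of the file (`"EntityType": 3,`), so every line has to be a complete JSON object or array on its own. Here is a rough Python sketch of that check. The function name `check_lines` and the sample strings are made up for illustration; this is not Logstash code.

```python
import json

def check_lines(lines):
    """Return (line_number, reason) for every line that is not a JSON object or array by itself."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # skip blank lines
        try:
            value = json.loads(text)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError
            problems.append((number, str(exc)))
            continue
        if not isinstance(value, (dict, list)):
            problems.append((number, "not an object or array"))
    return problems

# Hypothetical samples: the same record pretty-printed vs. one object per line.
pretty = '[\n\t{\n\t"EntityType": 3,\n\t"EventCategory": "Alerts"\n\t}\n]'
one_per_line = (
    '{"EntityType": 3, "EventCategory": "Alerts"}\n'
    '{"EntityType": 3, "EventCategory": "Alerts"}'
)

print(check_lines(pretty.splitlines()))        # every line is reported
print(check_lines(one_per_line.splitlines()))  # nothing is reported
```

Running a check like this over the file before pointing Logstash at it should flag the same lines that end up in the parse errors.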

Thanks for the reply. I've modified the file and it works perfectly now.
I'm not creating it myself though; I'm pulling it from an API, so I'll need to write another script to get each object onto its own line. I guess that since the parsing is handled in the input, there are no mutate tricks I can use, as those work at the filter level?

If not I'll work with what I have and modify the log before Logstash gets hold of it. Thanks again.

Not 100% sure what you're asking, but the conversion of an array of objects into one object per line needs to take place prior to Logstash.
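
As a minimal sketch of that pre-processing step, assuming the API response is a single JSON array of objects saved to a file: the names `to_json_lines` and `convert_file`, and the file names in the usage note, are placeholders and not anything Logstash provides.

```python
import json

def to_json_lines(records):
    """Serialize each record as compact JSON on its own newline-terminated line."""
    return "".join(json.dumps(record) + "\n" for record in records)

def convert_file(src_path, dst_path):
    """Read one JSON array from src_path and write one object per line to dst_path."""
    with open(src_path) as src:
        records = json.load(src)  # assumption: the top level is an array
    with open(dst_path, "w") as dst:
        dst.write(to_json_lines(records))
    return len(records)

# Small in-memory example with made-up records.
sample = [
    {"EntityType": 3, "EventCategory": "Alerts"},
    {"EntityType": 3, "EventCategory": "Events"},
]
print(to_json_lines(sample), end="")
```

Something like `convert_file("input.json", "flat.json")` would then produce the file that the file input reads. Nested objects are written out unchanged by this sketch.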

That was what I was asking. There are so many features, like the multiline input, that I was hoping I could use one of them. Doing the operation before the data comes in is fine though.
Cheers.

This might not be the right forum for this, but I'm posting it in case other people are trying to do the same thing.

I created a quick and dirty Python script that takes nested JSON and writes output that input => {file{codec => json}} likes.

import json
from flatten_json import flatten_json  # third-party; need to get this prior

with open('input.json') as data_file:
    data = json.load(data_file)  # expects a top-level array of records

x = flatten_json(data)  # keys come back prefixed with the record index, e.g. "0_EntityType"

with open('output.json', 'w') as f:
    for i in range(len(data)):  # number of records, taken from the data itself
        prefix = '%s_' % i
        record = {}
        for key in x:
            if key.startswith(prefix):
                # slice the prefix off; lstrip() strips a set of characters, not a prefix
                record[key[len(prefix):]] = x[key]
        # json.dumps handles quoting, escaping and non-string values
        f.write(json.dumps(record) + '\n')  # every record, including the last, ends with a newline

I'm sure there's a better way of doing it, and it's 100% unsupported :slight_smile:, but anyone else searching for a solution might find it workable.
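
One possible variant without the third-party dependency, as a rough sketch only: it assumes the input is a top-level array of objects, and `flatten` here is a hand-written helper, not the flatten_json library function. Nested keys are joined with underscores, and empty objects or lists are simply dropped.

```python
import json

def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict with underscore-joined keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}  # leaf value: store it under the accumulated key
    flat = {}
    for key, child in items:
        name = "%s_%s" % (prefix, key) if prefix else str(key)
        flat.update(flatten(child, name))
    return flat

def records_to_lines(records):
    """Flatten each record and write it as one JSON object per line."""
    return "".join(json.dumps(flatten(record)) + "\n" for record in records)

# Made-up record with nesting, just to show the key naming.
sample = [{"EntityType": 3, "Site": {"Name": "A", "Ids": [1, 2]}}]
print(records_to_lines(sample), end="")
```

Because each record is flattened separately, there is no record-index prefix to strip afterwards.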