Extract all JSON objects from a log into one field

Hello, I'm trying to extract several possible JSON objects from a log and put them into one field as an array.
This is an example of the log I want to parse:

This is one of example logs: [{"lat":12.33,"lng":55.44}] that I should correctly parse [{"lat":12.33,"lng":55.44}]. It can contain multiple jsons [{"lat":12.33,"lng":55.44}].

The acceptance criterion is a field visible in Kibana that contains all of these JSON objects in one field, for example "extracted_objects": [ {..}, {..}, {..} ].
I'm able to parse the first JSON using the following config:

grok {
    # capture the text between "[" and the last "]" into java_object_json
    match => { "message" => "%{CISCO_REASON}: \[%{GREEDYDATA:java_object_json}\].?" }
}

json {
    # parse the captured string into a structured field
    source => "java_object_json"
    target => "java_object"
}

but I hope this is not the way I should go... Because these JSON objects can be nested, it's hard to solve this with a regexp.
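For example, because %{GREEDYDATA} is greedy, a pattern like \[%{GREEDYDATA:java_object_json}\] captures from the first [ to the last ] when a line contains several bracketed objects (a hypothetical line, just to illustrate):

input:    Test: [{"lat":12.33}] and [{"lat":1.0}]
capture:  {"lat":12.33}] and [{"lat":1.0}

and that capture is not valid JSON, so the json filter fails.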

Does this solve your issue?

ruby {
  code => "
    require 'json'
    # parse the whole message as JSON and store the result in new_field
    new_field = JSON.parse(event.get('message'))
    event.set('new_field', new_field)
  "
}

This assumes your input message is like [{"lat":12.33,"lng":55.44}].

If that array is only part of your message (e.g. whatever you want before [{"lat":12.33,"lng":55.44}] whatever you want after), first grok that part out into a field and then pass that field to JSON.parse in the ruby filter.
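For instance, something along these lines (just a sketch; the grok pattern and the array_json field name are placeholders for your actual format, and it assumes a single bracketed array per message):

filter {
  grok {
    # capture everything between the first "[" and the last "]"
    match => { "message" => "\[%{GREEDYDATA:array_json}\]" }
  }
  ruby {
    code => "
      require 'json'
      # re-add the brackets stripped by the grok pattern, then parse
      event.set('new_field', JSON.parse('[' + event.get('array_json') + ']'))
    "
  }
}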

With this syntax, a pipeline like

input {
  stdin{}
}

filter {
  ruby {
    code => "
      require 'json'
      new_field = JSON.parse(event.get('message'))
      event.set('new_field', new_field)
    "
  }
}

output {
  stdout{}
}

Would act like this:
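For example, typing the sample line on stdin produces an event whose new_field holds the parsed array (assuming default codecs; output simplified):

input:     [{"lat":12.33,"lng":55.44}]
new_field: [{"lat"=>12.33, "lng"=>55.44}]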

Is this what you're looking for?

I can't determine in advance what the log will look like. A log can be like:

Test logging java object: [{"lat":12.33,"lng":55.44}] hjklgfdjhkl [{"lat":12.33,"lng":55.44}]

It can simply contain multiple JSON objects inside plain text,

and your config will fail to parse that. What I really need is the ability to look for a JSON pattern in the log and capture all matches, and then put all those matches into one field (an array).

Well, my first thought would be to tell whoever is generating these logs to fix them upstream, since they can't be that random.

Anyway, are the JSON objects ALWAYS surrounded by square brackets? Is there any chance curly brackets appear in the log without surrounding a JSON object?
I mean, is something like the following a possible log?

{"no_square":"valid_json"} qsdfwerg {invalid-->json} [{"valid":"json"}] qsfdw

Square brackets are not required; it's just a convention. It could be changed to < > or anything else if that helps.

No, I meant: are all those JSON objects always surrounded by square brackets (or whatever else you choose)? Or is there any chance a JSON object appears outside the brackets?

There is no chance. A printed object (JSON) will always be wrapped in some kind of brackets to make it more visible. And I can define something other than square brackets, because right now they make every JSON parse as an array.

Ok so, IF the JSON objects in the input are ALWAYS inside square brackets, this should do what you need:

input {
  stdin{}
}

filter {
  ruby {
    code => "
      require 'json'
      # returns true if the given string parses as JSON
      def valid_json?(json)
        JSON.parse(json)
        return true
      rescue JSON::ParserError => e
        return false
      end
      # grab every substring between a '[' and the next ']'
      possible_jsons = event.get('message').scan(/(?<=\[).+?(?=\])/)
      # keep only the candidates that actually parse as JSON
      jsons = possible_jsons.map{|item| JSON.parse(item) if valid_json?(item)}.compact
      event.set('new_field', jsons)
    "
  }
}

output {
  stdout{}
}

With this pipeline, here's the input and output:
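For the earlier sample line, that comes out as:

input:     Test logging java object: [{"lat":12.33,"lng":55.44}] hjklgfdjhkl [{"lat":12.33,"lng":55.44}]
new_field: [{"lat"=>12.33, "lng"=>55.44}, {"lat"=>12.33, "lng"=>55.44}]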

It's fine, but it only works for JSON objects without arrays. I changed the surrounding square brackets to < > and now it works correctly, even with nested JSON objects containing arrays. So your code works great; I only need to find or create a better regexp to handle surrounding square brackets. Thank you for the help.

No problem!

Should you need help with the regex just ask.
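For reference, Ruby's regex engine supports recursion, so a pattern like the following can match balanced square brackets even when the wrapped JSON contains nested arrays (a sketch, not tested against your real logs):

# \g<0> recursively re-applies the whole pattern, so nested [ ] pairs
# (e.g. JSON arrays inside the object) are consumed as a unit
balanced = /\[(?:[^\[\]]|\g<0>)*\]/

msg = 'Test: [{"lat":12.33,"pts":[1,2]}] noise [{"lng":55.44}]'
candidates = msg.scan(balanced).map { |m| m[1..-2] }  # strip the outer brackets
# => ["{\"lat\":12.33,\"pts\":[1,2]}", "{\"lng\":55.44}"]

You could drop that in place of the lookbehind/lookahead scan in the ruby filter above.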

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.