Parse JSON

Hi All,

I would like to ask for help with parsing data from JSON. I have a log file with a huge JSON document, and after parsing, some parts of it remain unparsed:

dataj.AffectedItems {
"Attachments": "ATT00001.png (24733b); ATT00002.jpg (24684b); ATT00003.jpg (15299b); ATT00004.jpg (8273b); ATT00005.jpg (9061b); ATT00006.png (1564b)",
"Id": "RgAAAAB8ZguAnTGNQ4JPH2DKODg0BwBtsBPdSBEtT6dXWciKpCD7AAAAqSyqAABQYiqz+jZqRJR42ypOkanPAABJkwQ/AAAJ",
"InternetMessageId": "INX.1556c8dc8d061de33e85714224e.282a.15239.4710.16e@inxserver.com",
"ParentFolder": {
"Id": "LgAAAAB8ZguAnTGNQ4JPH2DKODg0AQBtsBPdSBEtT6dXWciKpCD7AAAAqSyqAAAB",
"Path": "\Inbox"
},
"Subject": "Subject"
}

How can I parse those parts as well?

Expected output:

dataj.AffectedItems.Attachments: "ATT00001.png (24733b); ATT00002.jpg (24684b); ATT00003.jpg (15299b); ATT00004.jpg (8273b); ATT00005.jpg (9061b); ATT00006.png (1564b)",
dataj.AffectedItems.Id:"RgAAAAB8ZguAnTGNQ4JPH2DKODg0BwBtsBPdSBEtT6dXWciKpCD7AAAAqSyqAABQYiqz+jZqRJR42ypOkanPAABJkwQ/AAAJ",
dataj.AffectedItems.InternetMessageId: "INX.1556c8dc8d061de33e85714224e.282a.15239.4710.16e31c7b9981ff.bounce_7960@inxserver.com>"
dataj.AffectedItems.ParentFolder.ID:"LgAAAAB8ZguAnTGNQ4JPH2DKODg0AQBtsBPdSBEtT6dXWciKpCD7AAAAqSyqAAAB",
dataj.AffectedItems.ParentFolder.Path:"\Inbox"
},
dataj.AffectedItems.Subject: "Subject"
}

Thank you very much

Buzz

You should use a json filter. What problem do you have when you do that?

Hi Badger, thank you for the answer.

I tried something like this:

json {
source => "datajson"
target => "json"
}

json {
source => "json.AffectedItems" # this is probably the problem
target => "AffectedItems"
}

But it doesn't work.

Buzz

A json filter will parse all of the nested items within the JSON. There is no need for multiple filters.
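As a minimal sketch, assuming the raw JSON string is in a field named datajson and the parsed result should go under json (both names taken from the configuration posted above), a single filter would look like this:

```
filter {
    # One json filter parses the whole document, including nested
    # objects and arrays such as AffectedItems.
    json {
        source => "datajson"   # field holding the raw JSON string (name from the post above)
        target => "json"       # parsed fields are placed under [json]
    }
}
```

Note that Logstash addresses nested fields with bracket notation, so the parsed array would be [json][AffectedItems] rather than "json.AffectedItems".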

No, it doesn't work in my case. The example I showed is what I get after the json filter. It parsed a lot of items, but not that one.

Buzz

In the log I can see:

[2019-11-08T16:19:16,849][WARN ][logstash.filters.json ] Error parsing json {:source=>"AffectedItems", :raw=>[{"Attachments"=>"ATT00001.png (24733b); ATT00002.jpg (24684b); ATT00003.jpg (15299b); ATT00004.jpg (8273b); ATT00005.jpg (9061b); ATT00006.png (1564b)", "Subject"=>"Subject", "InternetMessageId"=>"INX.1556c8dc8d061de33e85714224e.282a.15239.4710.16e@inxserver.com", "ParentFolder"=>{"Path"=>"\Inbox", "Id"=>"LgAAAAB8ZguAnTGNQ4JPH2DKODg0AQBtsBPdSBEtT6dXWciKpCD7AAAAqSyqAAAB"}, "Id"=>"RgAAAAB8ZguAnTGNQ4JPH2DKODg0BwBtsBPdSBEtT6dXWciKpCD7AAAAqSyqAABQYiqz+jZqRJR42ypOkanPAABJkwQ/AAAJ"}], :exception=>java.lang.ClassCastException: org.jruby.RubyArray cannot be cast to org.jruby.RubyIO}

Maybe the problem is that there are several Id fields?

Buzz

That's not valid JSON. It is using => to separate keys from values rather than :

What does the entire event look like before you start trying to parse it? Please use markdown to prevent the browser consuming parts of your message.

No, that was the error message from Logstash.

The full JSON of my event is here:

{"GeneralTypeName":"Audit.Exchange","ClientIPAddress":"x.x.x.x","ClientInfoString":"Client=ActiveSync","ClientProcessName":null,"ClientVersion":null,"ExternalAccess":false,"InternalLogonType":0,"LogonType":0,"LogonUserSid":"S-1-5-21-XXXXXXXXXXXX-XXXXXXXX-XXXXXXXXX-XXXXXXX","MailboxGuid":"1111111111111111111111111","MailboxOwnerSid":"S-1-5-21-XXXXXXXXXXXX-XXXXXXXX-XXXXXXXXX-XXXXXXX","MailboxOwnerUPN":"upn@company.com","OrganizationName":"company.onmicrosoft.com","OriginatingServer":"VI1PR0502MB3950 (15.20.2387.009)\r\n","SessionId":null,"AffectedItems":[{"Id":"adf213as2d1f321asd32f132as1df23as1df321asd32f1","Attachments":"ATT00001.png (24733b); ATT00002.jpg (24684b); ATT00003.jpg (15299b); ATT00004.jpg (8273b); ATT00005.jpg (9061b); ATT00006.png (1564b)","InternetMessageId":"<INX.1556c8dc8d061de33e85714224e.282a.15239.4710.16@inxserver.com>","ParentFolder":{"Id":"1231231321323121321321231321321321sdfsdfsdf","Path":"\\Inbox"},"Subject":"Subject"}],"CrossMailboxOperation":false,"Folder":{"Id":"adsf213asdf123asdf132as1df32a1sdf321","Path":"\\Inbox"},"CreationTime":"2019-11-03T18:04:27","Id":"asdasdasd23123a1sd32132a1sd321","Operation":"MoveToDeletedItems","OrganizationId":"1231231231321321321231321321321","RecordType":3,"UserKey":"100300009307E306","UserType":0,"Version":"1","Workload":"Exchange","ResultStatus":"Succeeded","ObjectId":null,"UserId":"upn@company.com","ClientIP":"x.x.x.x","Scope":0}

The json filter parses almost everything except

AffectedItems:
{
"Attachments": "ATT00001.png (24733b); ATT00002.jpg (24684b); ATT00003.jpg (15299b); ATT00004.jpg (8273b); ATT00005.jpg (9061b); ATT00006.png (1564b)",
"Subject": "Subject",
"InternetMessageId": "INX.123121312313213213@inxserver.com",
"ParentFolder": {
"Path": "\Inbox",
"Id": "1231231321323121321321231321321321sdfsdfsdf"
},
"Id": "1231231321323121321321231321321321sdfsdfsdf"
}

Buzz

But a json filter does parse that.

filter { json { source => "message" remove_field => [ "message" ] } }

produces this rubydebug output

{
            "LogonType" => 0,
      "MailboxOwnerUPN" => "upn@company.com",
        "AffectedItems" => [
    [0] {
                  "Subject" => "Subject",
        "InternetMessageId" => "<INX.1556c8dc8d061de33e85714224e.282a.15239.4710.16@inxserver.com>",
              "Attachments" => "ATT00001.png (24733b); ATT00002.jpg (24684b); ATT00003.jpg (15299b); ATT00004.jpg (8273b); ATT00005.jpg (9061b); ATT00006.png (1564b)",
             "ParentFolder" => {
            "Path" => "\\Inbox",
              "Id" => "1231231321323121321321231321321321sdfsdfsdf"
        },
                       "Id" => "adf213as2d1f321asd32f132as1df23as1df321asd32f1"
    }
],
       "ExternalAccess" => false,
           "RecordType" => 3,
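Given output of that shape, values inside the array can be read with bracketed field references that include the array index. A small sketch, where first_item_subject and first_item_folder are hypothetical field names chosen only for illustration:

```
filter {
    json { source => "message" remove_field => [ "message" ] }
    mutate {
        # Copy values out of the first element of the parsed array.
        # The target field names here are made up for this example.
        add_field => {
            "first_item_subject" => "%{[AffectedItems][0][Subject]}"
            "first_item_folder"  => "%{[AffectedItems][0][ParentFolder][Path]}"
        }
    }
}
```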

That error is telling you that the thing you are trying to parse is not JSON. It is not JSON because it has already been parsed. For example, if I run this configuration

input { generator { count => 1 lines => [ '{"AnArray":[{"a":1}, {"b":2}]}' ] } }
filter {
    json { source => "message" }
}
output { stdout { codec => rubydebug { metadata => false } } }

I get

   "AnArray" => [
    [0] {
        "a" => 1
    },
    [1] {
        "b" => 2
    }
],

in my rubydebug output. If I then add a second json filter

json { source => "AnArray" }

I get the error

Error parsing json {:source=>"AnArray", :raw=>[{"a"=>1}, {"b"=>2}], :exception=>java.lang.ClassCastException: org.jruby.RubyArray cannot be cast to org.jruby.RubyIO}
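So the second json filter should simply be removed. If the aim is to get fields like AffectedItems.ParentFolder.Path without an array index in between, one option is a split filter, which emits one event per array element. This is only a sketch, and it assumes that having a separate event for each affected item is acceptable:

```
filter {
    json { source => "message" }
    # AffectedItems is already an array of hashes at this point,
    # so it must not be passed to another json filter.
    # split clones the event once per element and replaces the
    # array with that single element.
    split { field => "AffectedItems" }
}
```

After the split, each event has AffectedItems as a single object, so the path would be referenced as [AffectedItems][ParentFolder][Path].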