Nested JSON array parse failure

I am having trouble parsing a JSON log file that contains a nested array. I am trying to add the values in the array under "docs" as their own searchable fields, linked to the target docs. Here is a sample log:

{"dba_server":"dbaserver","dba_version":"123.4.0.1123610.1729","docs":[{"childproc_count":0,"cmdline":"psexec.exe  -i -s C:\\WINDOWS\\system32\\mmc.exe /s taskschd.msc","crossproc_count":1,"filemod_count":4,"host_type":"workstation","last_update":"2020-09-30T11:43:58.808Z","modload_count":52,"netconn_count":0,"os_type":"windows","parent_guid":"00013235-0000-2c90-01d6-971ee9af8eba","parent_name":"cmd.exe","parent_pid":11408,"parent_segment_id":"1","parent_unique_id":"00013235-0000-2c90-01d6-971ee9af8eba-000000000001","path":"c:\\users\\xxxx\\documents\\2-work\\c-programs\\pstools\\psexec.exe","process_guid":"00013235-0000-15f4-01d6-971efba829ba","process_md5":"27304B246C7D5B4E149124D5F93C5B01","process_name":"psexec.exe","process_pid":5620,"process_sha256":"3337e3875b05e0bfba69ab926532e3f179e8cfbf162ebb60ce58a0281437a7ef","regmod_count":1,"segment_id":"1601466945474","start":"2020-09-30T11:43:58.092Z","unique_id":"1234asdf35-0000-15f4-01d6-97asdfba829ba-0174dasdf4fc2","username":"asdf\\xxxx","watchlist_1797":"2020-09-30T11:50:03.222527Z","watchlist_tag":"1797|asdf.suspicious.asdf"}],"process_guid":"00013235-0000-15f4-01d116-9712341efbasdf9asdf","process_id":"00013235-0000-15f4-01d6-971efba829ba","segment_id":"1","server_name":"xx-xxx.xxx.edu","timestamp":1601466945.474,"type":"watchlist.storage.hit.process","watchlist_id":1711234,"watchlist_name":"xxx.asdf.asdf"}

Here is my config file (I've tried several different changes, but this is the latest):

filter {
        if [event_type] == "asdf" {
                grok {
                        match => { "message" => '{"dba_server":"%{WORD:dba_server}","dba_version":"%{DATA:dba_version}","docs":\[{"%{GREEDYDATA:msg}"}\],"%{GREEDYDATA:msg1}"}'}
                }
                json {
                        source => "msg"
                        target => "docs"
                        remove_field => "msg"
                }
                json {
                        source => "msg1"
                        remove_field => "msg1"
                }
        }
}

This config results in a _jsonparsefailure tag. Any ideas on what I am doing wrong here?

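Neither [msg] nor [msg1] is valid JSON, since neither is surrounded by {}. The grok pattern matches the opening {" and the closing "} as literal text outside the captures, so each field holds only the inside of an object.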
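One way to confirm this, as a rough sketch: temporarily comment out the two json filters and print the event so the contents of [msg] and [msg1] are visible. The stdout output with the rubydebug codec is standard Logstash. The field contents described in the comments are what the grok pattern should capture from the sample line, not output copied from a real run.

```
output {
        # temporary debugging output; prints every event with all of its fields
        stdout { codec => rubydebug }
}
# With the grok pattern from the question, the value of [msg] should begin with
#     childproc_count":0,"cmdline":"psexec.exe ...
# and end with
#     ..."watchlist_tag":"1797|asdf.suspicious.asdf
# The {" before it and the "} after it are consumed by the pattern itself,
# so the json filter is handed a fragment rather than a JSON object.
```

rubydebug prints string values with their inner quotes escaped, so the field will look slightly different on screen, but the missing braces should still be easy to spot.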

Why not just parse the whole message?

filter { json { source => "message" remove_field => [ "message" ] } }

@Badger
What you suggested was my first attempt. Although I don't get a jsonparsefailure that way, the parsed key:value pairs don't show up in Kibana as available fields. What's confusing is that I can see them as filter options in the Kibana search field. Is there a particular reason why that happens?

Also, thanks for the reply.

Here are a few screenshots of what I am referring to. First, here is the updated conf file:

filter {
        if [event_type] == "abc" {
                json {
                    source => "message"
                    remove_field => [ "timestamp", "message" ]
                }
        }
}

Here is the list of available fields in Kibana, which does not include the fields in the array:

Here is a screenshot of the expanded document that shows the array under the docs field:

And here is a screenshot of the search field showing the filters:

Neither "key" nor "value" appears in your sample JSON, so I do not understand why you would expect them to appear in the event.

Maybe I am using the wrong words, and my apologies if that's the case. I am not saying the actual word "key" or "value" is missing. What I am referring to is, for example, the fields in this sample log file:

{"dba_server":"dbaserver","dba_version":"6.4.0.1asdf0.17234","docs":[{"childproc_count":0,"cmdline":"fltmc  unload paritydriver","comms_ip":"xx.xxx.xx.xx","computer_name":"xxxx-xxxx-xxx","crossproc_count":1,"emet_count":0,"filemod_count":0,"filtering_known_dlls":false,"group":"default group","host_type":"asdf","hostname":"xxxx-xxxx-xxx","id":"00xxxxb-00asdf01210-0e1asdfc-01d6-91f655def5e8","interface_ip":"xx.xxx.xx.xx","last_server_update":-1,"last_update":"2020-09-23T22:10:24.355Z","modload_count":28,"netconn_count":0,"os_type":"xxx","parent_guid":"0002479b-0000-06f4-01d6-91db3a5ec28d","parent_id":"0002479b-0000-06f4-01d6-91db3a5ec28d","parent_md5":"00000000000000000000000000000000","parent_name":"cmd.exe","parent_pid":1780,"parent_segment_id":"1","parent_unique_id":"0002479b-0000-06f4-01d6-91db3a5ec28d-000000000001","path":"c:\\windows\\system32\\xxx.exe","process_guid":"000asdf2asdfaa4asfd79b-0000-0e1c-01d6-91f655def5e8","process_md5":"BCACasdfasdfASDFASFREWTYFGEED1DCE5619C63D21F7112","process_name":"xxxmc.exe","process_pid":3612,"process_sha256":"@!#$HNJKZX94530ASDF8c1220319cbb69e2a2370334d7849304eb3bb7b1d2fd811c4324e","processblock_count":0,"regmod_count":0,"segment_id":"1601497621219","sensor_id":149403,"start":"2020-09-23T22:10:24.266Z","unique_id":"0002479b-0000-0e1c-01d6--0174e0b262e3","username":"xxxx\\xxxx"}],"highlights_by_doc":{"0002479b-0000-0e1c-01d6-91f655def5e8-0174e0b262e3":["xxxx  PREPREPREunloadPOSTPOSTPOST paritydriver","c:\\windows\\system32\\PREPREPRExxxx.exePOSTPOSTPOST"]},"server_name":"xxx-xxxx","timestamp":1601498166.457215,"type":"watchlist.xxx.xxx","watchlist_id":1111,"watchlist_name":"xxxx"}

I am not seeing the values from the array in the available fields area in Kibana but I can search for those fields via KQL. Take the "comms_ip" from the sample log...I can search against this field and get the correct documents...

But I don't see "docs.comms_ip" in the available fields area.

Did you refresh the index pattern in Kibana?

Yes, I actually deleted the index before updating the filter conf file, and I also refreshed the index pattern, but I am still not seeing them in the available fields.

I wonder if that is related to the fact that docs is an array.

You could try either

mutate { replace => { "docs" => "%{[docs][0]}" } }

or

split { field => "docs" }

to see if it then works the way you want. If it does, then you would be in a position to ask a question about arrays in the Kibana forum.

I finally got a chance to revisit this, and your second suggestion (the split filter) is what solved my problem. Thanks again for your help, as always.
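For anyone landing here later, the working filter presumably looked something like the sketch below: the json filter from earlier in the thread, followed by the split filter on the docs array. The final config was never posted, so the combination shown here and the [docs] guard around split are assumptions.

```
filter {
        if [event_type] == "abc" {
                # parse the whole line as JSON, as suggested earlier in the thread
                json {
                        source => "message"
                        remove_field => [ "timestamp", "message" ]
                }
                # assumption: only split events that actually contain a docs array
                if [docs] {
                        # emits one event per element of [docs]; in each new event,
                        # [docs] is a single object instead of an array
                        split {
                                field => "docs"
                        }
                }
        }
}
```

After the split, the nested values are addressed as [docs][comms_ip] in Logstash and should appear as docs.comms_ip in Kibana once the index pattern is refreshed. Note that a log line with several entries in docs becomes several events, each carrying a copy of the top-level fields.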
