I have a data source that sends data in JSON format. Some of this data is held in an array, which makes it difficult to visualize. However, the array only ever holds a single object.
For example, the "docs" array below:
"field_1":"",
"field_2":"",
"docs":[{
"alliance_data_srstrust":[""],
"alliance_link_srstrust":"https://",
"alliance_score_srstrust":-100,
"alliance_updated_srstrust":"2014-10-07T00:29:07Z",
"childproc_count":1,
"cmdline":"C:\WINDOWS\splwow64.exe 8192",
"comms_ip":"",
"computer_name":"",
"crossproc_count":2,
"filemod_count":2,
"group":"********",
"host_type":"workstation",
"hostname":"",
"id":"0000106f-0000-1aec-01d1-d161694406e2",
"interface_ip":"",
"last_update":"2016-06-28T17:20:53.101Z",
"modload_count":84,
"netconn_count":0,
"os_type":"windows",
"parent_guid":"0000106f-0000-0e88-01d1-d16166a9fb12",
"parent_md5":"000000000000000000000000000000",
"parent_name":"acrord32.exe",
"parent_pid":3720,
"parent_unique_id":"0000106f-0000-0e88-01d1-d16166a9fb12-00000001",
"path":"c:\windows\splwow64.exe",
"process_guid":"0000106f-0000-1aec-01d1-d161694406e2",
"process_md5":"127AA81343A7C6F665C22CB1293B0A90",
"process_name":"splwow64.exe",
"process_pid":6892,
"regmod_count":9,
"segment_id":1,
"sensor_id":4207,
"start":"2016-06-28T17:20:47.855Z",
"unique_id":"",
"username":"**"
}],
I want to remove the "docs" key and the brackets around its array so that the fields inside it become ordinary top-level fields. I tried to do this with the filter below, but it generates lots of number_format_exception errors in the Logstash logs.
The data still seems to load into Elasticsearch, but I don't know whether any of it is missing.
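# Only rewrite events whose raw JSON contains the "docs" array.
if [message] =~ /"docs":\[\{/ {
  mutate {
    # First pair drops the ',"docs":[{' opener; second pair drops the '}],' closer.
    gsub => [
      "message", ',"docs":\[\{', ',',
      "message", '\}\],', ','
    ]
  }
}
# Parse the (possibly rewritten) message as JSON.
json {
  source => "message"
}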
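An alternative that avoids editing the raw string might be to parse the JSON first and then lift the fields of the single "docs" element to the top level with a ruby filter. This is an untested sketch, not a confirmed fix for the errors above. It assumes that "docs" always holds exactly one object, that none of its keys collide with existing top-level fields, and that the Event API methods event.get, event.set and event.remove are available (Logstash 5.0 and later; older releases use event["docs"] style access instead).

```
filter {
  # Parse the raw JSON first, so "docs" becomes a real array field.
  json {
    source => "message"
  }
  if [docs] {
    ruby {
      # Assumption: "docs" is an array holding exactly one object.
      code => '
        docs = event.get("docs")
        if docs.is_a?(Array) && docs.length == 1 && docs[0].is_a?(Hash)
          # Copy each nested field to the top level, then drop the array.
          docs[0].each { |key, value| event.set(key, value) }
          event.remove("docs")
        end
      '
    }
  }
}
```

Events where "docs" is missing, empty, or holds more than one object would pass through unchanged in this sketch.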
Is there a better way to extract the data from a single-element JSON array?
And what does number_format_exception mean when parsing data in Logstash?
Thank you.