Fixing Mapping for Objects in Array (objects in arrays are not well supported)

I started ingesting audit logs from Google Cloud, and I'm getting "Objects in arrays are not well supported" notifications in Kibana for all arrays found in the logs.

What would be the best solution to fix the issue?

  1. Changing the data type from object to nested? (see the mapping sketch after this list)
    -- Kibana doesn't support nested fields, and I'm not sure this is even a good solution.
  2. A parent-child relationship?
    -- Slower query performance.
  3. Denormalizing the data?
    -- Adds more documents to the current index.
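
For reference, option 1 would look something like this as an explicit mapping (a minimal sketch; the index name gcp-audit is made up here, and the rest of the mapping is omitted):

PUT gcp-audit
{
  "mappings": {
    "properties": {
      "protoPayload": {
        "properties": {
          "authorizationInfo": {
            "type": "nested"
          }
        }
      }
    }
  }
}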

If I choose #3 (denormalize the data), how should I go about doing it for the following document?

[
  {
    "protoPayload": {
      "@type": "type.googleapis.com/google.cloud.audit.AuditLog",
      "status": {},
      "authenticationInfo": {
        "principalEmail": "some_email@gmail.com"
      },
      "requestMetadata": {
        "callerIp": "127.1.1.1",
        "callerSuppliedUserAgent": "Windows 10",
        "requestAttributes": {
          "time": "2019-01-01T20:25:12.030662677Z",
          "auth": {}
        },
        "destinationAttributes": {}
      },
      "serviceName": "iam.googleapis.com",
      "methodName": "google.iam.admin.v1.ListServiceAccountKeys",
      "authorizationInfo": [
        {
          "resource": "projects/-/serviceAccounts/55558855585858855",
          "permission": "iam.serviceAccountKeys.list",
          "granted": true,
          "resourceAttributes": {
            "name": "projects/-/serviceAccounts/55558855585858855"
          }
        }
      ],
      "resourceName": "projects/-/serviceAccounts/55558855585858855",
      "request": {
        "@type": "type.googleapis.com/google.iam.admin.v1.ListServiceAccountKeysRequest",
        "name": "projects/test_environment",
        "key_types": [
          1
        ]
      },
      "response": {
        "@type": "type.googleapis.com/google.iam.admin.v1.ListServiceAccountKeysResponse"
      }
    },
    "insertId": "9890890890ddd",
    "resource": {
      "type": "service_account",
      "labels": {
        "project_id": "devops",
        "unique_id": "55558855585858855",
        "email_id": "elastic_devops@email.me"
      }
    },
    "timestamp": "2019-01-01T20:25:11.920552117Z",
    "severity": "INFO",
    "logName": "projects/logging/project/devops",
    "receiveTimestamp": "2019-01-01T20:25:13.057955601Z"
  }
]

To remove the arrays from the data, you could do something like this (taking authorizationInfo as an example):

"authorizationInfo": [
    {
      "resource": "projects/-/serviceAccounts/55558855585858855",
      "permission": "iam.serviceAccountKeys.list",
      "granted": true,
      "resourceAttributes": {
        "name": "projects/-/serviceAccounts/55558855585858855"
      }
    },
   {
      "resource": "projects/-/serviceAccounts/sfasfasfasdfasdf",
      "permission": "iam.serviceAccountKeys.list",
      "granted": true,
      "resourceAttributes": {
        "name": "projects/-/serviceAccounts/sfasfasfasdfasdf"
      }
   }
  ],

becomes

"authorizationInfo.0": 
    {
      "resource": "projects/-/serviceAccounts/55558855585858855",
      "permission": "iam.serviceAccountKeys.list",
      "granted": true,
      "resourceAttributes": {
        "name": "projects/-/serviceAccounts/55558855585858855"
      }
    },
"authorizationInfo.1": 
    {
      "resource": "projects/-/serviceAccounts/sfasfasfasdfasdf",
      "permission": "iam.serviceAccountKeys.list",
      "granted": true,
      "resourceAttributes": {
        "name": "projects/-/serviceAccounts/sfasfasfasdfasdf"
      }
    },
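
If I end up scripting it, I suppose a Logstash ruby filter could produce that shape, something like this (an untested sketch, assuming the array sits at [protoPayload][authorizationInfo]; note that Elasticsearch expands dotted field names such as authorizationInfo.0 back into nested objects, so underscores might be safer in practice):

filter {
  ruby {
    code => '
      # copy each array element to its own numbered field, then drop the array
      info = event.get("[protoPayload][authorizationInfo]")
      if info.is_a?(Array)
        info.each_with_index do |entry, i|
          event.set("[protoPayload][authorizationInfo.#{i}]", entry)
        end
        event.remove("[protoPayload][authorizationInfo]")
      end
    '
  }
}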

@Larry_Gregory what's the best way of doing this? Writing a script, or is there an easy way of splitting arrays in a data set?

Would the Split filter plugin help here?
https://www.elastic.co/guide/en/logstash/current/plugins-filters-split.html
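
From its docs, the split filter clones the event once per element of an array field, so each authorizationInfo entry would end up as its own event, and therefore its own document in Elasticsearch.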

@Larry_Gregory I tried to use the Split filter plugin to flatten arrays, but I'm getting an error.

In the logs, I have a new tag: "_split_type_failure"

In the Logstash log: "[2019-07-23T19:59:44,006][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[authorizationInfo] is of type = NilClass"

input {
  google_pubsub {
    project_id => "testing"
    topic => "test_topic"
    subscription => "logstash-sub"
    include_metadata => true
    codec => "json"
  }
  # optional, but helpful to generate the ES index and test the plumbing
  heartbeat {
    interval => 10
    type => "heartbeat"
  }
}
filter {
  # don't modify logstash heartbeat events
  if [type] != "heartbeat" {
    mutate {
      add_field => { "messageId" => "%{[@metadata][pubsub_message][messageId]}" }
    }
  }
}
filter {
  if [type] != "heartbeat" {
    split {
      field => "[authorizationInfo]"
    }
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["https://URL:9243"]
    ssl => true
    user => "XXXX"
    password => "XXXX"
    index => "logstash-gcp-audit-%{+YYYY.MM.dd}"
  }
}

A super simple fix: the array lives under protoPayload, not at the top level, so the field reference needs the full path:

filter {
    split { field => "[protoPayload][authorizationInfo]" }
}
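
One more thing worth noting: heartbeat events (and any other events without that field) will still trip the same NilClass warning, so it may be worth guarding the split with a field-existence check (a sketch using standard Logstash conditional syntax):

filter {
  # only split events that actually carry the array
  if [protoPayload][authorizationInfo] {
    split { field => "[protoPayload][authorizationInfo]" }
  }
}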

So let's say that I now have two objects in this array (your example above). How would they get extracted (split) using this method?

split { field => "[protoPayload][authorizationInfo]" }
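
Based on the split filter's documented behavior, the event should be cloned once per array element, so the two-object example above would come out as two events (two Elasticsearch documents), each carrying a single authorizationInfo object in place of the array, roughly:

# event 1
"protoPayload": {
  "authorizationInfo": {
    "resource": "projects/-/serviceAccounts/55558855585858855",
    "permission": "iam.serviceAccountKeys.list",
    "granted": true,
    "resourceAttributes": {
      "name": "projects/-/serviceAccounts/55558855585858855"
    }
  }
}

# event 2
"protoPayload": {
  "authorizationInfo": {
    "resource": "projects/-/serviceAccounts/sfasfasfasdfasdf",
    "permission": "iam.serviceAccountKeys.list",
    "granted": true,
    "resourceAttributes": {
      "name": "projects/-/serviceAccounts/sfasfasfasdfasdf"
    }
  }
}

All the other top-level fields (timestamp, insertId, and so on) would be duplicated into both events.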
