Unable to parse JSON Array object in Logstash

I am trying to parse an example JSON array in Logstash, as follows:

{
    "Records": [
    {
        "eventVersion": "1.03",
        "userIdentity": {
            "type": "IAMUser",
            "principalId": "111122223333",
            "arn": "arn:aws:iam::111122223333:user/myUserName",
            "accountId": "111122223333",
            "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
            "userName": "myUserName"
        },
        "eventTime": "2015-08-26T20:46:31Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "DeleteBucketPolicy",
        "awsRegion": "us-west-2",
        "sourceIPAddress": "127.0.0.1",
        "userAgent": "[]",
        "requestParameters": {
            "bucketName": "myawsbucket"
        },
        "responseElements": null,
        "requestID": "47B8E8D397DCE7A6",
        "eventID": "cdc4b7ed-e171-4cef-975a-ad829d4123e8",
        "eventType": "AwsApiCall",
        "recipientAccountId": "111122223333"
    },
    {
       "eventVersion": "1.03",
       "userIdentity": {
            "type": "IAMUser",
            "principalId": "111122223333",
            "arn": "arn:aws:iam::111122223333:user/myUserName",
            "accountId": "111122223333",
            "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
            "userName": "myUserName"
        },
      "eventTime": "2015-08-26T20:46:31Z",
      "eventSource": "s3.amazonaws.com",
      "eventName": "PutBucketAcl",
      "awsRegion": "us-west-2",
      "sourceIPAddress": "",
      "userAgent": "[]",
      "requestParameters": {
          "bucketName": "",
          "AccessControlPolicy": {
              "AccessControlList": {
                  "Grant": {
                      "Grantee": {
                          "xsi:type": "CanonicalUser",
                          "xmlns:xsi": "http://www.w3.org/2001/XMLSchema-instance",
                          "ID": "d25639fbe9c19cd30a4c0f43fbf00e2d3f96400a9aa8dabfbbebe1906Example"
                       },
                      "Permission": "FULL_CONTROL"
                   }
              },
              "xmlns": "http://s3.amazonaws.com/doc/2006-03-01/",
              "Owner": {
                  "ID": "d25639fbe9c19cd30a4c0f43fbf00e2d3f96400a9aa8dabfbbebe1906Example"
              }
          }
      },
      "responseElements": null,
      "requestID": "BD8798EACDD16751",
      "eventID": "607b9532-1423-41c7-b048-ec2641693c47",
      "eventType": "AwsApiCall",
      "recipientAccountId": "111122223333"
    },
    {
      "eventVersion": "1.03",
      "userIdentity": {
          "type": "IAMUser",
          "principalId": "111122223333",
          "arn": "arn:aws:iam::111122223333:user/myUserName",
          "accountId": "111122223333",
          "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
          "userName": "myUserName"
        },
      "eventTime": "2015-08-26T20:46:31Z",
      "eventSource": "s3.amazonaws.com",
      "eventName": "GetBucketVersioning",
      "awsRegion": "us-west-2",
      "sourceIPAddress": "",
      "userAgent": "[]",
      "requestParameters": {
          "bucketName": "myawsbucket"
      },
      "responseElements": null,
      "requestID": "07D681279BD94AED",
      "eventID": "f2b287f3-0df1-4961-a2f4-c4bdfed47657",
      "eventType": "AwsApiCall",
      "recipientAccountId": "111122223333"
    }
  ]
}

My Logstash configuration file is as follows:

input {
        file {
                path => "/home/xenial/Documents/cloudtrail2.json"
                start_position => "beginning"
                codec => "json"
        }
}

filter {
        json {
                source => "messages"
        }
        split {
                field => "Records"
        }
}


output {
  elasticsearch {
    hosts => ["localhost:9200"]
    manage_template => false
#   index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
#   document_type => "%{[@metadata][type]}"
  }
}

But this returns the error message:

[2017-06-13T19:15:29,479][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:Records is of type = NilClass

Is my configuration file correct?

With that config, you are parsing each line of JSON separately, which does not work. You need a multiline codec on the input. Something like:

input {
        stdin {
                codec => multiline {
                        # Join every line that does not start with "}" onto the
                        # following line, so the whole document becomes one event
                        pattern => "^}"
                        negate => true
                        what => next
                        max_lines => 20000
                }
        }
}
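
Since you are reading from a file rather than stdin, the same codec can be attached to your file input. A sketch using your original path (the sincedb_path setting is just an assumption, there so the file gets re-read on every test run):

input {
        file {
                path => "/home/xenial/Documents/cloudtrail2.json"
                start_position => "beginning"
                # Assumption: discard the sincedb so the file is re-read on each test run
                sincedb_path => "/dev/null"
                codec => multiline {
                        pattern => "^}"
                        negate => true
                        what => next
                        max_lines => 20000
                }
        }
}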

Then, to parse it, the following appears to do what you want:

filter {
        # Remove the embedded newlines so the whole document is a single JSON string
        mutate { gsub => ["message", "\n", ""] }
        # Parse the JSON, then split the Records array into one event per element
        json { source => "message" }
        split { field => "Records" }
}

Lastly, I would say that debugging this is much easier with a rubydebug output on stdout:

output {
        stdout { codec => "rubydebug" }
}
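
With that in place, each event coming out of the split should carry one element of the Records array, roughly like this (trimmed by hand; your @timestamp, host, and the raw message string will differ):

{
     "@timestamp" => 2017-06-13T19:15:29.479Z,
       "@version" => "1",
        "message" => "{    \"Records\": [ ...",
        "Records" => {
        "eventVersion" => "1.03",
           "eventName" => "DeleteBucketPolicy",
         "eventSource" => "s3.amazonaws.com",
        ...
    }
}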

Personally, I would add a filter to delete the message field, provided the json parse worked (i.e. there is no parse failure tag):

    if "_jsonparsefailure" not in [tags] { mutate { remove_field => "message" }  }
