Hi All,
I use the json filter plugin to parse log entries in JSON format, such as the one below:
{"actor_ip":"xxx","from":"Api::ActionsRunnerRegistration#POST","actor":"xxx","actor_id":2480,"org":"xxx","org_id":13,"action":"org.remove_self_hosted_runner","created_at":1669056434332,"data":{"user_agent":"GitHubActionsRunner-linux-x64/2.299.1 ClientId/3xxx RunnerId/229517 GroupId/2 CommitSHA/xxx","controller":"Api::ActionsRunnerRegistration","request_id":"d33b7a32-a424-46eb-82c2-232b30eff9f4","request_method":"post","request_category":"api","server_id":"10c1e833-0d38-4808-a0b2-7df5c87fac59","version":"v3","auth":"integration_installation","current_user":"xxx","integration_id":240,"installation_id":539,"_document_id":"SgNDkJsRSlmSjOqPkFv-2A","@timestamp":1669056434332,"operation_type":"remove","category_type":"Resource Management","business":"xxx","business_id":1,"actor_location":{"country_code":"US","country_name":"United States","location":{"lat":37.751,"lon":-97.822}}}}
The problem is with the data field, which itself contains embedded JSON. As a result, once the events reach my SIEM, it creates new fields there such as data_user_agent, data_created_at, etc. Since this is a GitHub audit log, a lot of different keys can appear under data, which has resulted in more than 500 data_something fields in the table in my SIEM and exceeded its column threshold.
Using the stdout output (rubydebug codec), below is a single entry that shows how the logs are parsed:
{
    "from" => "xxx",
    "org_id" => 1386,
    "created_at" => 1669008758000,
    "tags" => [
        [0] "json",
        [1] "01fixed"
    ],
    "data" => {
        "permissions" => {
            "metadata" => "read",
            "contents" => "write"
        },
        "controller" => "Api::Integrations",
        "aqueduct_job_id" => "xxx",
        "parent_integration_installation_id" => 481,
        "operation_type" => "modify",
        "auth" => "xxx",
        "token_last_eight" => "xxx",
        "expires_at" => "xxx",
        "repository_selection" => "selected",
        "request_method" => "post",
        "job" => "ScopedIntegrationInstallableExpirationExtensionJob",
        "repository_ids" => [
            [0] 8848
        ],
        "integration_id" => 209,
        "integration" => "tca-read-write-content",
        "server_id" => "62079e0a-aa16-47a9-a830-0782295a1b69",
        "request_id" => "80355149-ce81-485c-b440-771027bbb5f3",
        "_document_id" => "s6pSScziiGIU5qPIeiCUDg",
        "scoped_integration_installation_id" => 642448,
        "version" => "v3",
        "scoped_integration_installation" => "scoped_integration_installation-642448",
        "actor_location" => {
            "postal_code" => "xxx",
            "city" => "xxx",
            "country_code" => "xxx",
            "location" => {
                "lat" => 50.1162,
                "lon" => 8.6365
            },
            "country_name" => "xxx",
            "region" => "xxx",
            "region_name" => "xxx"
        },
        "user_agent" => "python-requests/2.26.0",
        "active_job_id" => "9f018dd3-80ed-4529-8464-58189832d9cd",
        "business_id" => 1,
        "business" => "xxx",
        "category_type" => "Other",
        "request_category" => "api",
        "@timestamp" => 1669085347156
    },
    "org" => "xxx",
    "action" => "scoped_integration_installation.extend_expires_at",
    "actor_ip" => "xxx",
    "EventTime" => "2022-11-22T01:49:07.000Z"
}
My question is: can I somehow tell Logstash not to break down the embedded JSON, but instead give me a single field called data (as a string) that I can later convert to the dynamic type in my cloud SIEM and parse at search time? That way I won't exceed the 500-column threshold.
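The only idea I have so far is to re-serialize the parsed data hash back into a single JSON string with a ruby filter placed after the json filter, roughly like this (just a sketch based on the Event API docs, not tested):

ruby {
  # turn the parsed "data" hash back into one JSON string,
  # so the SIEM sees a single "data" column instead of data_* fields
  code => '
    d = event.get("data")
    event.set("data", d.to_json) if d.is_a?(Hash)
  '
}

Is something like this the right approach, or is there a cleaner way to do it directly in the json filter?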