Add new field using part of existing value

Feedy · May 27, 2020, 12:43am

Hello - I would like to create a new field named "process_name" and populate it with part of an existing field's value. Example:

Sample JSON Log :

"cb_server":"cbserver","computer_name":"xxxx-WA","direction":"outbound","domain":"","event_type":"netconn","local_ip":"::1","local_port":1234,"md5":"ASDFASDFASDFAS","pid":12345,"process_guid":"123412341234123405722f4","process_path":"c:\\users\\name\\appdata\\roaming\\createagent-1.1\\create_bridge.exe","protocol":1,"proxy":false,"remote_ip":"asdfasdf","remote_port":1234,"sensor_id":1234,"sha256":"ASDFASDF@#$!@#$%","timestamp":1589578181,"type":"ingress.event.netconn"

Is it possible to create a new field called "process_name" that contains just the "create_bridge.exe" portion of the existing field "process_path"?

Logstash filter:

filter {
        if [log_type] == "netconn" {
                grok {
                        match => {
                                "message" => [ "%{GREEDYDATA:netconn_raw}" ]
                        }
                }
                json {
                        source => "netconn_raw"
                }
                mutate {
                        remove_field => [ "netconn_raw", "message", "timestamp" ]
                }
        }
}

Badger · May 27, 2020, 12:48pm

The grok filter does nothing useful here. Why not just use

source => "message"

in the json filter? Also, instead of using a separate mutate you can add the remove_field option to the json filter. That way the fields only get removed if the JSON is parsed successfully.

You can use grok to extract the process name from the process path. I haven't tested it, but something like

grok { match => { "process_path" => "\\(?<process_name>[^\\]+)$" } }

You may need to play around with the number of backslashes in each place (2, 4, ..).
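
On the backslash count, here is an annotated sketch of the pattern above. It assumes logstash.yml leaves config.support_escapes at its default of false, in which case (as far as I understand) quoted strings in the pipeline config reach the regex engine without being unescaped. Each \\ in the config is then a regex-escaped backslash that matches one literal \ in the field value. The doubled backslashes in the sample log are only JSON escaping; once the json filter has parsed the event, process_path holds single backslashes.

```
filter {
    grok {
        # \\       one literal backslash (the last directory separator)
        # [^\\]+   one or more characters that are not a backslash
        # $        end of the value, so only the final path component is captured
        match => { "process_path" => "\\(?<process_name>[^\\]+)$" }
    }
}
```

If config.support_escapes were set to true, the same string would presumably need four backslashes in each place, which may be what the "(2, 4, ..)" hint refers to.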

Feedy · May 27, 2020, 3:50pm

@Badger - Thanks for taking the time to help me with this. I took your suggestions and made the following changes to my filter and I am seeing some progress:

filter {
        if [log_type] == "netconn" {
                json {
                        source => "message"
                        remove_field => [ "timestamp" ]
                }
                grok {
                        match => [ "process_path", "%{PATH:process_path}\\%{DATA:process_name}",
                                   "process_path", "%{UNIXPATH:process_path}/%{DATA:process_name}"
                        ]
                        remove_field => [ "message" ]
                }
        }
}

The logs are being parsed, but the process_path field is showing two values.

How would I go about creating a new field titled process_name using just the svchost.exe portion?

Badger · May 27, 2020, 4:15pm

You have a [process_path] field and you are asking grok to add a [process_path] field, so you end up with an array. Add the following option

overwrite => [ "process_path" ]

What [process_name] field do the resulting events have?
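
One way to answer that question is to run the filter against a single hard-coded event and print the result. This is a minimal sketch, assuming a throwaway config file (the name test.conf is hypothetical), the generator input and the rubydebug codec. The log_type conditional from the thread is left out so the sample event is not skipped, and the sample path is the svchost.exe example from this thread.

```
input {
    generator {
        # Emit a single event whose message is a small JSON document.
        # Single quotes keep the inner double quotes intact.
        count => 1
        lines => [ '{"process_path":"c:\\windows\\system32\\svchost.exe"}' ]
    }
}
filter {
    json { source => "message" }
    grok {
        # Capture the last path component into process_name
        match => { "process_path" => "\\(?<process_name>[^\\]+)$" }
    }
}
output {
    # Print every field of the resulting event to the console
    stdout { codec => rubydebug }
}
```

Run it with bin/logstash -f test.conf, then check whether process_name appears and whether process_path is a single string or an array.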

Feedy · May 27, 2020, 4:27pm

@Badger - I still want to keep the field "process_path" and its value, but I would also like to create a new field named "process_name" that holds only the last portion of the process_path value.

So essentially I would have the following fields:

process_path = c:\windows\system32\svchost.exe
process_name = svchost.exe

As of right now process_name is not being created.

Badger · May 27, 2020, 4:38pm

That's hard to believe unless you are getting a _grokparsefailure, but the doubling up of process_path tells me you are not getting a parse failure.

Note that the UNIXPATH pattern can be very, very expensive. That is why I suggested anchoring to end-of-string and using a character group that excludes the directory separator. Try

grok {
    overwrite => [ "process_path" ]
    match => { "process_path" => [
        "%{DATA:process_path}\\(?<process_name>[^\\]+)$",
        "%{DATA:process_path}/(?<process_name>[^/]+)$"
        ]
    }
}

That should be cheaper.

Feedy · May 28, 2020, 4:45pm

@Badger - I am now seeing the new field "process_name" being created. I noticed that the "process_path" field has changed: the last portion is being dropped from its value. Not a huge deal, but is there a way to keep the full path in that field? Also, how do I account for events that do not have the "process_path" field and end up with _grokparsefailure tags? The tag isn't a huge issue because the logs are still being parsed; I believe those events receive it only because they lack the process_path field.

Latest Filter:

filter {
       if [log_type] == "netconn" {
               json {
                       source => "message"
                       remove_field => [ "timestamp" ]
               }
               grok {
                       overwrite => [ "process_path" ]
                       match => [ "process_path", "%{DATA:process_path}\\(?<process_name>[^\\]+)$",
                                  "process_path", "%{DATA:process_path}/(?<process_name>[^/]+)$"
                                ]
                       remove_field => [ "message" ]
               }
       }
}

Badger · May 28, 2020, 5:07pm

If you do not want to modify process_path then do not overwrite it:

grok {
    match => { "process_path" => [
        "\\(?<process_name>[^\\]+)$",
        "/(?<process_name>[^/]+)$"
        ]
    }
}

Feedy · May 28, 2020, 5:19pm

@Badger - yeah I gave that a try before I replied but it lists an array for the value. I can live with the current results I am getting at this point. Thanks for all your help.

Feedy · May 28, 2020, 6:04pm

@Badger - I re-read your latest reply and realized I had read it incorrectly. After making the change you suggested, it's working as it should. Thanks again. Final config listed below just in case anybody has the same issue:

filter {
        if [log_type] == "netconn" {
                json {
                        source => "message"
                        remove_field => [ "timestamp" ]
                }
                grok {
                        match => [ "process_path", "\\(?<process_name>[^\\]+)$",
                                   "process_path", "/(?<process_name>[^/]+)$"
                                 ]
                        remove_field => [ "message" ]
                }
        }
}
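
The question about events that have no process_path field was left open in the thread. A minimal sketch of one common approach is to wrap the grok filter in a conditional on that field, assuming the goal is simply to skip grok for those events. This is a variation on the final config, not something confirmed in the thread. Moving the removal of "message" into the json filter is also an assumption, made so that events skipping grok still lose the raw message.

```
filter {
    if [log_type] == "netconn" {
        json {
            source => "message"
            # Removed only when the JSON parses successfully
            remove_field => [ "timestamp", "message" ]
        }
        # Run grok only when the event has a process_path field,
        # so events without it are not tagged _grokparsefailure
        if [process_path] {
            grok {
                match => { "process_path" => [
                    "\\(?<process_name>[^\\]+)$",
                    "/(?<process_name>[^/]+)$"
                    ]
                }
            }
        }
    }
}
```

If the tag is the only concern, an alternative is grok's tag_on_failure => [ ] option. It suppresses the tag for every failed match, including genuine pattern mismatches, so the conditional is the more selective choice.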

