Help me write a grok filter for the pattern

I have an HTTP log of the form:

10.255.255.255 - jira [11/Mar/2021:10:00:03 -0800] "GET /svn/repos/branches/feature-IPv6-TWLP-3.2/ZProxyHealthManager.cpp HTTP/1.1" 200 29110 

and my pattern is:

%{IPORHOST:client_ip} %{HTTPDUSER:ident} %{USER:username} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:http_method} %{NOTSPACE:svn_path}(?: HTTP/%{NUMBER:http_version})?|%{DATA:svn_path})\" %{NUMBER:http_response} (?:%{NUMBER:content_length}|-)

I need one more field, named svn_branch, taken from the part of svn_path that follows /branches (e.g. here svn_branch is /feature-IPv6-TWLP-3.2). Please help me create this new field.

filter {
    grok {
        # GREEDYDATA captures everything after /branches, including the file path
        match => { "svn_path" => "(.*)/branches%{GREEDYDATA:svn_brnch}" }
    }
    mutate {
        add_field => { "svn_branch" => "%{svn_brnch}" }
    }
}

Hi @Mohammed_Anas. Thanks for replying!

I didn't get the expected output; add_field doesn't seem to be working.
Here is my Logstash pipeline:

input {

    file {

        path => "C:/Users/Abhishek S/Desktop/logfiles/prod logs/httpd-access-new.log"
        start_position => "beginning"
        type => "apache-access"
        sincedb_path => "NUL"
    }
}

filter {

    if [type] == "apache-access" {

        grok {
            match => { "message" => [
                                        "%{IPORHOST:client_ip} %{HTTPDUSER:ident} %{USER:username} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:http_method} %{NOTSPACE:svn_path}(?: HTTP/%{NUMBER:http_version})?|%{DATA:svn_paths})\" %{NUMBER:http_response} (?:%{NUMBER:con_len}|-) \"-\" \"%{GREEDYDATA:user_agent}\"" ,
                                        "%{IPORHOST:client_ip} %{HTTPDUSER:ident} %{USER:username} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:http_method} %{NOTSPACE:svn_path}(?: HTTP/%{NUMBER:http_version})?|%{DATA:svn_paths})\" %{NUMBER:http_response} (?:%{NUMBER:con_len}|-)" 
                                    ]

                        
                    }

            match => { "svn_path" => "(.*)/branches/%{GREEDYDATA:svn_brnch}"
                     }
        }

        date {
            match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
        }

        mutate { 
            add_field => { "svn_branch" => "%{svn_brnch}" }
            remove_field => [ "http_version", "host", "path", "ident" ] 
            
            }
    }

}


output {

    elasticsearch {

        hosts => "http://localhost:9200"
        index => "svn.corp"
        
    }
    stdout { }
}   

And the output is:

{
         "@version" => "1",
         "username" => "jirasvn",
          "message" => "10.34.115.33 - jirasvn [11/Mar/2021:10:00:29 -0800] \"GET /svn/repos/!svn/rvr/271559/mobile/client/branches/release-3.4/apps/mac/system/ZSecureAgent/ZSecureAgent.xcodeproj/project.pbxproj HTTP/1.1\" 200 335628 \"-\" \"SVN/1.9.7 (x86_64-pc-linux-gnu) serf/1.3.9\"\r",
      "http_method" => "GET",
       "svn_branch" => "%{svn_brnch}",
        "client_ip" => "10.34.115.33",
    "http_response" => "200",
       "@timestamp" => 2021-03-11T18:00:29.000Z,
       "user_agent" => "SVN/1.9.7 (x86_64-pc-linux-gnu) serf/1.3.9",
          "con_len" => "335628",
        "timestamp" => "11/Mar/2021:10:00:29 -0800",
         "svn_path" => "/svn/repos/!svn/rvr/271559/mobile/client/branches/release-3.4/apps/mac/project.pbxproj",
             "type" => "apache-access"
}

Please help me find where I'm going wrong.

If you have multiple match options in a single grok filter they may not be evaluated in the order you expect. If the svn_path match is evaluated before that field is created then you will get the result you see. Split it into two grok filters.
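
A minimal sketch of that split, assuming the message pattern from the original question. The second grok runs only after the first has created svn_path. Writing the capture straight into svn_branch with an inline named group, and setting tag_on_failure to an empty list so requests without /branches/ are not tagged, are choices made for this sketch and not part of the pipeline above.

```
filter {
    # First grok: parse the raw line and create svn_path.
    grok {
        match => { "message" => "%{IPORHOST:client_ip} %{HTTPDUSER:ident} %{USER:username} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:http_method} %{NOTSPACE:svn_path}(?: HTTP/%{NUMBER:http_version})?|%{DATA:svn_path})\" %{NUMBER:http_response} (?:%{NUMBER:content_length}|-)" }
    }

    # Second grok: svn_path now exists, so this match has a field to read.
    # [^/]+ stops at the next slash, so only the branch name is captured.
    grok {
        match => { "svn_path" => "/branches/(?<svn_branch>[^/]+)" }
        # Assumption: paths without /branches/ are normal and should not be tagged.
        tag_on_failure => []
    }
}
```

With the sample line from the question this should give svn_branch the value feature-IPv6-TWLP-3.2, without the leading slash. Because the field is captured directly, the intermediate svn_brnch field and the mutate add_field step are not needed.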

Hi @Badger & @Mohammed_Anas ,

I got this done and it's working!
But as you can see, there's a huge time gap between @timestamp and timestamp, which is annoying me. Can you please help me keep them identical?
By the way, my timezone is:

timezone => "Asia/Kolkata"

No, there is not. @timestamp is 18:00:29 in UTC, and timestamp is 10:00:29 in the timezone that is 8 hours behind UTC (that is what the -0800 means). They are the same instant.

Thanks a lot @Badger for the on-point explanation!

But what changes do I need to make so that both times are in my timezone (viz., 8 hrs behind UTC)?

@timestamp will always be in UTC, because Elasticsearch always stores times as UTC. Kibana then shifts them to the browser's timezone.
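
If a local-time value is still wanted on the event itself, one possible workaround is to leave @timestamp alone and write a formatted string into a separate field. This is only a sketch: timestamp_local is a hypothetical field name, the fixed -08:00 offset is an assumption taken from the question (it ignores daylight saving time), and the result is a plain string, not a date.

```
filter {
    ruby {
        # Assumption: a fixed -08:00 offset; change it to suit your timezone.
        # @timestamp itself is not modified and stays in UTC.
        code => "event.set('timestamp_local', event.get('@timestamp').time.getlocal('-08:00').strftime('%Y-%m-%d %H:%M:%S %z'))"
    }
}
```

This filter needs to run after the date filter, so that @timestamp already holds the time parsed from the log line.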
