Multiple grok parsing not extracting all the fields

I am trying to extract some fields from the msgbody field using grok, but only the first pattern in the grok filter produces a field.

Fields of interest - corId, controller, httpStatusText and uri

Sample Data -

2020-01-03 10:44:17,025 [93] ERROR MedServFileLogger corId=cf25b00d-1e37-4eb7-ab75-82ceeec7fdab - Exception controller= Loan action= Getmethod= GET uri= http://xxxxxxxxxx/v2/media/instance/xxxx/loans/cdb79433-32fa-4df8-b73a-e87aa89f2007/files/images-178ee8d0-fa48-4b9f-a8df-abcc9cfb1ac7.zip/entries/0b3e99f8-8af8-49a5-95b1-1537c715eb43.png?tokencreator=Encompass&tokenexpires=1578076775&token=pLCvT%2F1pBPhuFXiHKDIlB5F9feocqeq7Wxx%2FyhAz7B6DCcKeOP3YjO%2FnalfjTgXdieAmyFHEiW72Soym14oBuw%3D%3D
System.UnauthorizedAccessException: MediaTokenInvalid - A valid Token must be provided for accessing Media

2020-01-03 03:58:12,822 [37] ERROR MedServFileLogger corId=5aa9b90b-9fe6-4700-aa8f-c08be2d3f0ea - Returning controller= Health action= Getmethod= GET uri= http://localhost/v2/media/healthhttpStatusCode=503 httpStatusText=ServiceUnavailable

Logstash Filter -

filter {

if [project] == "media_server"
{
grok {
match => [ "message", "(?m)%{TIMESTAMP_ISO8601:logtime} \[(?<threadid>[\d.]+)\] +%{LOGLEVEL:loglevel} %{GREEDYDATA:msgbody}" ]
}

     grok {
            match => {
            break_on_match => "false"
            "msgbody" => [ "corId=%{UUID:corId}", "controller=%{SPACE}%{WORD:controller}", "httpStatusText=%{WORD:httpStatusText}", "uri=%{SPACE}%{URI:uri}" ]
            }
    }

       date {
        locale => "en"
        match => ["logtime", "YYYY-MM-dd HH:mm:ss,SSS"]
        timezone => "America/Los_Angeles"
        target => "@timestamp"

    }
    mutate
    {
            remove_field => [ "msgbody" ]
    }

}
}

With the above configuration, only the corId field is extracted; none of the other fields appear on the event. I don't see any parsing errors or failures in the Logstash logs.

Appreciate any help or guidance with this.

Thanks

I tested with your sample data, and this grok pattern works for me:

(?m)%{TIMESTAMP_ISO8601:logtime}%{SPACE}\[%{INT:num}\]%{SPACE}%{LOGLEVEL:log_level}%{SPACE}%{WORD:logger}%{SPACE}corId=(?<corId>[A-Za-z0-9-]+)%{SPACE}-%{SPACE}%{WORD:field1}%{SPACE}controller=%{SPACE}%{WORD:Controller}%{SPACE}action=%{SPACE}Getmethod=%{SPACE}%{WORD:get_method}%{SPACE}uri=%{SPACE}%{URI:url}%{SPACE}httpStatusText=%{WORD:http_Status}

There is a Grok Debugger available in Kibana. You can paste your sample data there and build a grok pattern that matches it.

Thanks @mancharagopan,

The log messages are not consistent. In the first example, a stack trace follows the "uri" field and there is no "httpStatusText". More generally, some of the logs do not contain the "httpStatusText" field at all.

Hence, I am trying to capture everything in the "msgbody" field and then use multiple match patterns to extract only the required fields from that body.

https://www.elastic.co/guide/en/logstash/5.5/plugins-filters-grok.html#plugins-filters-grok-match

@nmoham
You can put %{GREEDYDATA:msgbody} after %{URI:url}%{SPACE} to capture all the trailing text in the msgbody field. Then you can use if conditions on msgbody to apply different patterns depending on what it contains.
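
A minimal sketch of that approach, assuming an earlier grok has already written the remainder of the line to msgbody. The regex conditions and the choice of fields in each branch are illustrative assumptions, not taken from a tested pipeline:

```
filter {
  # Only try the status pattern when the key is actually present
  if [msgbody] =~ /httpStatusText=/ {
    grok {
      match => { "msgbody" => "httpStatusText=%{WORD:httpStatusText}" }
    }
  }

  # Likewise, only try the uri pattern on events that carry a uri
  if [msgbody] =~ /uri=/ {
    grok {
      match => { "msgbody" => "uri=%{SPACE}%{URI:uri}" }
    }
  }
}
```

Each grok runs only on events that contain its key, so an event without httpStatusText never reaches that pattern and cannot fail it.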

hi @mancharagopan,

Got it working by setting "patterns_dir" and "break_on_match" as options of the grok filter itself, before the "match" stanza rather than inside it.

            patterns_dir => "/etc/logstash/patterns"
            break_on_match => "false"

Working Filter -

filter {

   if [project] == "media_server"
        {
        grok {
            match => [ "message", "(?m)%{TIMESTAMP_ISO8601:logtime} \[(?<threadid>[\d.]+)\] +%{LOGLEVEL:loglevel} %{GREEDYDATA:msgbody}" ]
        }

         grok {
                patterns_dir => "/etc/logstash/patterns"
                break_on_match => "false"
                match => {
                "msgbody" => [ "corId=%{UUID:corId}", "controller=%{SPACE}%{WORD:controller}", "httpStatusText=%{WORD:httpStatusText}", "uri=%{SPACE}%{URI:uri}" ]
                }
        }

           date {
            locale => "en"
            match => ["logtime", "YYYY-MM-dd HH:mm:ss,SSS"]
            timezone => "America/Los_Angeles"
            target => "@timestamp"

        }
        mutate
        {
                remove_field => [ "msgbody" ]
        }
  }
}
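
For anyone who wants to check this behaviour outside the full pipeline, a throwaway config along these lines may help. It is only a sketch: it applies the patterns directly to message rather than msgbody, reads pasted lines from stdin, and the file name test.conf is hypothetical. It can be run with bin/logstash -f test.conf.

```
input { stdin { } }

filter {
  grok {
    # Option of the grok filter itself, not a key inside match
    break_on_match => false
    match => {
      "message" => [
        "corId=%{UUID:corId}",
        "controller=%{SPACE}%{WORD:controller}",
        "httpStatusText=%{WORD:httpStatusText}",
        "uri=%{SPACE}%{URI:uri}"
      ]
    }
  }
}

# Print each parsed event so the extracted fields are visible
output { stdout { codec => rubydebug } }
```

With break_on_match left at its default of true, grok stops after the first pattern that matches, which is consistent with only corId being extracted in the original filter.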

Thank you so much for your valuable input.
