Grok pattern Help


(DDA) #1

Hello,

I need some help creating a grok pattern for the following log file. The lines alternate: one line has a timestamp and a message, and the next line has the log level and another message:

Aug 25, 2015 12:00:02 AM This is some log messages.
DEBUG: Checking context[/webdav] redeploy resource /home/genesys/gcti/BOEXI/bobje/tomcat/webapps/webdav
Aug 25, 2015 12:00:02 AM This is some log messages.
DEBUG: Checking context[/webdav] redeploy resource /home/genesys/gcti/BOEXI/bobje/tomcat/webapps/webdav/META-INF/context.xml

  1. I am having trouble parsing "This is some log messages." as one continuous string. I would like to store it in a field named "Message". So far, I've been able to parse the date as follows:

%{MONTH} %{MONTHDAY}, 20%{YEAR} %{HOUR}::?%{MINUTE}(?::?%{SECOND}) (?:AM|am|PM|pm)

I'm having trouble parsing the remainder of the line.

  2. How should I handle the alternating types of log lines? I would understand how to do it if all the lines in the log file were consistent.

Thanks for your assistance.

Regards


(Magnus Bäck) #2

Use a multiline codec or filter to join the second line of each message with the line that begins with the timestamp. The multiline logic would be "if the current line doesn't begin with a timestamp, join with the previous line".

Other comments:

  • Use %{YEAR}, not 20%{YEAR}.
  • You can use %{TIME} instead of %{HOUR}::?%{MINUTE}(?::?%{SECOND}).
  • You're going to want to capture the AM/PM into a field. Actually, you're going to want to capture all of the timestamp into a field so that you can pass it to a date filter. You can do that with (?<timestamp>%{MONTH} ... (?:AM|am|PM|pm)).
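
As a rough, untested sketch of that last step: assuming the capture above stores the value in a field named timestamp, and that the timestamps always look like "Aug 25, 2015 12:00:02 AM", a date filter along these lines should parse it. The format string is only my reading of the sample lines, so adjust it if your data differs.

filter {
  date {
    # "timestamp" is the field produced by the named capture in the grok expression
    match => [ "timestamp", "MMM d, yyyy h:mm:ss a" ]
  }
}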

(DDA) #3

Hi Magnus,

Thanks for your assistance. We are having issues with the pattern part of the multiline codec. Is the syntax correct for "pattern" and "match"?

input {
file {
path => "C:\Users\Admin\Desktop\Logs_elasticsearch\catalina.2015-08-25_test"
type => "Catalina"
codec => multiline {
pattern => "^%{timestamp} "
negate => true
what => previous
}
}

filter {
	grok {
	 match => {"timestamp" => %{MONTH} %{MONTHDAY}, %{YEAR} %{TIME} (?:AM|am|PM|pm)}
	}
}

As a reminder, the lines of the logs look as follows:

Aug 25, 2015 12:00:02 AM This is some log messages.
DEBUG: Checking context[/webdav] redeploy resource /home/genesys/gcti/BOEXI/bobje/tomcat/webapps/webdav

Thanks for your assistance


(Magnus Bäck) #4

The multiline pattern is basically a grok pattern (i.e. a regular expression with support for grok patterns) so try this:

pattern => "^%{MONTH} %{MONTHDAY}, %{YEAR} %{TIME}"

Your grok filter has a couple of problems. On the syntax side it's missing double quotes around the grok expression itself, but the major problem is that it's backwards. The filter

grok {
  match => { "fieldname" => "expression" }
}

means "match expression against the contents of the field named fieldname". The field you should be matching against is message. Your expression should, as previously mentioned, contain

(?<timestamp>%{MONTH} %{MONTHDAY}, %{YEAR} %{TIME} (?:AM|am|PM|pm))

to extract a field named timestamp that contains the timestamp part of the message.
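
Putting the pieces from this thread together, a configuration along the following lines might be a reasonable starting point. Treat it as an untested sketch: the field names summary, level, and detail are placeholders made up for illustration, and the expression assumes every event is exactly one timestamp line followed by one "LEVEL: text" line.

input {
  file {
    path => "C:\Users\Admin\Desktop\Logs_elasticsearch\catalina.2015-08-25_test"
    type => "Catalina"
    codec => multiline {
      # Lines that do not start with a timestamp are joined to the previous line.
      pattern => "^%{MONTH} %{MONTHDAY}, %{YEAR} %{TIME}"
      negate => true
      what => "previous"
    }
  }
}

filter {
  grok {
    # Match against the joined event text, which is stored in the message field.
    # summary, level, and detail are hypothetical field names.
    match => { "message" => "(?<timestamp>%{MONTH} %{MONTHDAY}, %{YEAR} %{TIME} (?:AM|am|PM|pm)) (?<summary>[^\r\n]*)\r?\n%{LOGLEVEL:level}: %{GREEDYDATA:detail}" }
  }
}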

