Grab specific fields from multiline log


(David Mc Whinnie) #1

Looking for some guidance on how to pull some specific pieces of data out of a multiline log. Have been using a logstash filter, but it is taking too long, and I am hoping there is a better way.

The logfile:

2017-04-25[08:02:16(UTC+06:00)]:E:pillase: ******* S T A R T of Error message *******
2017-04-25[08:02:16(UTC+06:00)]:E:pillase: Log message called from /view/port9.1a.01.nlbaucc.ccadm.snapshot/vobs/tt/lib/al_1/al_fpath.c: #279 keyword: sopen
2017-04-25[08:02:16(UTC+06:00)]:E:pillase: Pid 32121 Uid 23707 Euid 23707 Gid 125 Egid 125 Pset pillase@10.10.10.1:60090
2017-04-25[08:02:16(UTC+06:00)]:E:pillase: user_type N language 2 user_name pillase locale ISO88591/NULL package_comb b61prod
2017-04-25[08:02:16(UTC+06:00)]:E:pillase: session: "ottstpbdeman ";object: "ottstpbdeman"; function: "tgcomlogging.write" seq.open; company number: 301
2017-04-25[08:02:16(UTC+06:00)]:E:pillase: Errno 13 (Permission denied) bdb_errno 0
2017-04-25[08:02:16(UTC+06:00)]:E:pillase: Log_mesg:
2017-04-25[08:02:16(UTC+06:00)]:E:pillase: Error during sopen(/ln/bse/log/log.ottst, /ln/bse/log/log.ottst)
2017-04-25[08:02:16(UTC+06:00)]:E:pillase: ******* E N D of Error message *******
2017-04-25[08:02:16(UTC+06:00)]:E:pillase:
2017-04-25[08:03:24(UTC+06:00)]:E:pillase: ******* S T A R T of Error message *******
2017-04-25[08:03:24(UTC+06:00)]:E:pillase: Log message called from /view/port9.1a.01.nlbaucc.ccadm.snapshot/vobs/tt/lib/al_1/al_fpath.c: #279 keyword: sopen
2017-04-25[08:03:24(UTC+06:00)]:E:pillase: Pid 32121 Uid 23707 Euid 23707 Gid 125 Egid 125 Pset pillase@10.10.10.1:60090
2017-04-25[08:03:24(UTC+06:00)]:E:pillase: user_type N language 2 user_name pillase locale ISO88591/NULL package_comb b61prod
2017-04-25[08:03:24(UTC+06:00)]:E:pillase: session: "ottstpbdeman ";object: "ottstpbdeman"; function: "tgcomlogging.write" seq.open; company number: 301
2017-04-25[08:03:24(UTC+06:00)]:E:pillase: Errno 13 (Permission denied) bdb_errno 0
2017-04-25[08:03:24(UTC+06:00)]:E:pillase: Log_mesg:
2017-04-25[08:03:24(UTC+06:00)]:E:pillase: Error during sopen(/ln/bse/log/log.ottst, /ln/bse/log/log.ottst)
2017-04-25[08:03:24(UTC+06:00)]:E:pillase: ******* E N D of Error message *******
2017-04-25[08:03:24(UTC+06:00)]:E:pillase:

My logstash multiline codec is working for me:

codec => multiline {
  pattern => "%{YEAR}-%{MONTHNUM}-%{MONTHDAY}\[%{HOUR}:%{MINUTE}:%{SECOND}\(UTC%{ISO8601_TIMEZONE}\)\]:E:%{USERNAME}:[[:space:]]\*\*\*\*\*\*\* S"
  negate => true
  what => previous
}

My Grok filter is working until I get too deep into the log data, and it starts timing out.

match =>{"message" => "(?[0-9]{4}-[0-9]{2}-[0-9]{2})[(?[0-9]{2}:[0-9]{2}:[0-9]{2})(UTC%{ISO8601_TIMEZONE:timezone})]:E:%{USERNAME:username}:[[:space:]]******* S T A R T of Error message *******\n%{YEAR}-%{MONTHNUM}-%{MONTHDAY}[%{HOUR}:%{MINUTE}:%{SECOND}(UTC%{ISO8601_TIMEZONE})]:E:%{USERNAME}: Log message called from %{PATH:program}: #%{NUMBER} keyword: %{WORD:keyword}\n%{YEAR}-%{MONTHNUM}-%{MONTHDAY}[%{HOUR}:%{MINUTE}:%{SECOND}(UTC%{ISO8601_TIMEZONE})]:E:%{USERNAME}: Pid %{NUMBER:pid} Uid %{NUMBER:uid} Euid %{NUMBER:euid} Gid %{NUMBER:gid} Egid %{NUMBER:egid} Pset %{USERNAME}@%{IPORHOST:hostname}:%{NUMBER}\n%{DATA:data}\n%{DATA:skip2} "}

Gets me this output:

> {
>           "date" => "2017-04-25",
>            "gid" => "125",
>           "euid" => "23707",
>           "data" => "2017-04-25[08:03:24(UTC+06:00)]:E:pillase: user_type N language 2 user_name pillase locale ISO88591/NULL package_comb b61prod",
>       "timezone" => "+06:00",
>            "pid" => "32121",
>        "program" => "/view/port9.1a.01.nlbaucc.ccadm.snapshot/vobs/tt/lib/al_1/al_fpath.c",
>        "message" => "2017-04-25[08:03:24(UTC+06:00)]:E:pillase: ******* S T A R T of Error message *******\n2017-04-25[08:03:24(UTC+06:00)]:E:pillase: Log message called from /view/port9.1a.01.nlbaucc.ccadm.snapshot/vobs/tt/lib/al_1/al_fpath.c: #279 keyword: sopen\n2017-04-25[08:03:24(UTC+06:00)]:E:pillase: Pid 32121 Uid 23707 Euid 23707 Gid 125 Egid 125 Pset pillase@10.10.10.1:60090\n2017-04-25[08:03:24(UTC+06:00)]:E:pillase: user_type N language 2 user_name pillase locale ISO88591/NULL package_comb b61prod\n2017-04-25[08:03:24(UTC+06:00)]:E:pillase: session: \"ottstpbdeman \";object: \"ottstpbdeman\"; function: \"tgcomlogging.write\" seq.open; company number: 301\n2017-04-25[08:03:24(UTC+06:00)]:E:pillase: Errno 13 (Permission denied) bdb_errno 0\n2017-04-25[08:03:24(UTC+06:00)]:E:pillase: Log_mesg:\n2017-04-25[08:03:24(UTC+06:00)]:E:pillase: Error during sopen(/ln/bse/log/log.ottst, /ln/bse/log/log.ottst)\n2017-04-25[08:03:24(UTC+06:00)]:E:pillase: ******* E N D of Error message *******\n2017-04-25[08:03:24(UTC+06:00)]:E:pillase:",
>           "tags" => [
>         [0] "multiline"
>     ],
>           "path" => "/tmp/baan3.log",
>            "uid" => "23707",
>           "egid" => "125",
>       "hostname" => "10.10.10.1",
>     "@timestamp" => 2017-04-26T19:42:41.493Z,
>          "skip2" => "2017-04-25[08:03:24(UTC+06:00)]:E:pillase:",
>       "@version" => "1",
>           "host" => "corp-vb1-rh19",
>           "time" => "08:03:24",
>        "keyword" => "sopen",
>       "username" => "pillase"
> }

As I move further down the message trying to get session, object, errno etc. It is really slow or just times out.

Is there a better way to do this other than creating one big grok filter?


(Pablo) #2

Do you have a way to change the log message from origin?
Probably that would be the optimal solution.


(David Mc Whinnie) #3

This is a third party app, and I can't change the log format.


(David Mc Whinnie) #4

I found using several mutate statements I can remove the repeating fields, remove the long start and end strings, and convert the fields i need into KV pairs, and then use the KV filter.

This all runs very fast rather than parse using DATA fields.

Use these mutate entries to trim and KV the data

>   #Remove the new lines
>   mutate {gsub => ["message","\n"," "]}
>   #Remove duplicated date, time, timezone and username for ease and speed later
>   mutate {gsub => ["message","%{date}\[%{time}\(UTC\+%{timezone}\)\]:E:%{username}:",""]}
>   # Remove the S T A R T message
>   mutate {gsub => ["message","\*\*\*\*\*\*\* S T A R T of Error message \*\*\*\*\*\*\*",""]}
>   # Remove the : from the library line number
>   mutate {gsub => ["message",": #","#"]}
>   #Shorten and fix library key name
>   mutate {gsub => ["message","Log message called from ","library="]}
>   # shorten and fix company number key
>   mutate {gsub => ["message"," company number:"," company="]}
>   # convert colon to = for key value pairing
>   mutate {gsub => ["message",": ","="]}
>   # convert the ids (pid, guid, euid, egid, etc.) to key values
>   mutate {gsub => ["message","id ","id="]}
>   # convert pset to key value
>   mutate {gsub => ["message"," Pset "," process="]}
>   # convert Errno to key value and add quote
>   mutate {gsub => ["message"," Errno ",' errno="']}
>   # convert bdb_errono to key and add previous key quote
>   mutate {gsub => ["message"," bdb_errno ",' " bdb_errno="']}
>   # convert log_mseg to key value and add previous key quote
>   mutate {gsub => ["message","  Log_mesg=",'" Log_mesg="']}
>   #mutate {gsub => ["message"," Log_mesg:",'" log_mesg="']}
>   # Remove the E N D message
>   mutate {gsub => ["message","\*\*\*\*\*\*\* E N D of Error message \*\*\*\*\*\*\*",'"']}

This leaves me with a shorter message and KV pairs like this:

message" => " library=/view/port9.1a.01.nlbaucc.ccadm.snapshot/vobs/tt/lib/al_1/al_fpath.c#293 keyword=sopen Pid=2074 Uid=20151382 Euid=20151382 Gid=4 Egid=4 process=satabi@10.10.10.1:11872 user_type S language 2 user_name satabi locale / package_comb b61prod session="ttdskbbrowser";object="ttdskbbrowser"; function="unknown" ; company= 900 errno="2 (No such file or directory) " bdb_errno="513 (Data dictionary not found)" Log_mesg=" No definition in definition file for sopen(F_OBJ:ttdskbbrowser, ttdskbbrowser) " "

Then using KV filter I get this.

      "date" => "2017-04-24",
  "timezone" => "+06:00",
   "session" => "ttdskbbrowser",
      "Euid" => "20151382",
      "path" => "/tmp/baan3.log",
   "library" => "/view/port9.1a.01.nlbaucc.ccadm.snapshot/vobs/tt/lib/al_1/al_fpath.c#293",
  "function" => "unknown",
  "@version" => "1",
      "host" => "hostname",
   "company" => "900",
   "keyword" => "sopen",
   "process" => "satabi@10.10.10.1",
     "errno" => "2 (No such file or directory) ",
       "Gid" => "4",
       "Pid" => "2074",
   ";object" => "ttdskbbrowser",
       "Uid" => "20151382",
"@timestamp" => 2017-04-27T20:20:46.600Z,
  "Log_mesg" => " No definition in definition file for sopen(F_OBJ:ttdskbbrowser, ttdskbbrowser)  ",
      "time" => "17:29:57",
      "Egid" => "4",
  "username" => "satabi"

}


(Pablo) #5

Ha, brilliant, looks quite complicated that chunk of mutates, but if it works great, that's perfect!


(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.