Grok Failure with Logstash 6.0


(holiday-sunrise@gmx.de) #1

Hi, I'm trying to filter MSReportserver logs, but I always get a "Grokfailure". This is a sample log line:

library!WindowsService_20!37d8!11/06/2017-11:00:13:: i INFO: Cleaned 0 batch records, 0 policies, 0 sessions, 0 cache entries, 0 snapshots, 0 chunks, 0 running jobs, 0 persisted streams, 0 segments, 0 segment mappings, 0 edit sessions.

grok {
  match => ["message", "%{DATA:application}!%{DATA:service}!%{DATA:id}!%{DATA:log_timestamp}:: %{DATA:level_info} %{DATA:level_log}: %{GREEDYDATA:msg}"]
}

Is there a problem with the "::" part of the pattern?

When I try the pattern in the Grok Debugger (https://grokdebug.herokuapp.com/), it works fine.

What's wrong with my config?


(Magnus Bäck) #2

Not sure what's up here, but

  • replace DATA with more exact patterns and
  • start the expression with ^ since you only want to match at the start of the string, and
  • build the expression gradually. Start with something that only matches "library!" and build from there.
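
A minimal sketch of the gradual approach described above, assuming the sample line from post #1. Using WORD for the first two fields is an assumption based on that single sample line; other lines may need a broader pattern.

```
filter {
  grok {
    # Step 1: anchor at the start of the line and match only "library!".
    match => { "message" => "^%{WORD:application}!" }

    # Step 2 (once step 1 works, replace the line above with this one):
    # match => { "message" => "^%{WORD:application}!%{WORD:service}!" }
  }
}
```

Extending the expression one field at a time shows which piece first causes the event to be tagged with _grokparsefailure.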

(holiday-sunrise@gmx.de) #3

Hi Magnus, thanks for your reply.

With the following pattern I get no grokfailure, but the message is not fully parsed:

library!WindowsService_20!37d8!11/06/2017-11:00:13:: i INFO: Cleaned 0 batch records, 0 policies, 0 sessions, 0 cache entries, 0 snapshots, 0 chunks, 0 running jobs, 0 persisted streams, 0 segments, 0 segment mappings, 0 edit sessions.

grok {
  match => ["message", "%{DATA:application}!%{DATA:service}!%{DATA:id}!%{DATA:log_timestamp} %{GREEDYDATA:msg}"]
}

The result is:

application = library
service = WindowsService_20
id = 37d8
log_timestamp = 11/06/2017-11:00:13::
msg = i INFO: Cleaned 0 batch records, 0 policies, 0 sessions, 0 cache entries, 0 snapshots, 0 chunks, 0 running jobs, 0 persisted streams, 0 segments, 0 segment mappings, 0 edit sessions.

If I try to parse the rest of the message, it fails.

What do you mean by a more exact pattern? The first three attributes can be split on the exclamation mark "!".
The rest is separated by whitespace and colons.

Can you send me a more exact pattern?

Thanks

Rainer


(Magnus Bäck) #4

> With the following pattern I get no grokfailure, but the message is not fully parsed
> ...
> If I try to parse the rest of the message, it fails.

Works for me.

$ cat test.config 
input { stdin { } }
output { stdout { codec => rubydebug } }
filter {
  grok {
    match => [
      "message",
      "%{DATA:application}!%{DATA:service}!%{DATA:id}!%{DATA:log_timestamp}:: %{DATA:level_info} %{DATA:level_log}: %{GREEDYDATA:msg}"
    ]
  }
}
$ cat data 
library!WindowsService_20!37d8!11/06/2017-11:00:13:: i INFO: Cleaned 0 batch records, 0 policies, 0 sessions, 0 cache entries, 0 snapshots, 0 chunks, 0 running jobs, 0 persisted streams, 0 segments, 0 segment mappings, 0 edit sessions.
$ /opt/logstash/bin/logstash -f test.config < data
Settings: Default pipeline workers: 8
Pipeline main started
{
          "message" => "library!WindowsService_20!37d8!11/06/2017-11:00:13:: i INFO: Cleaned 0 batch records, 0 policies, 0 sessions, 0 cache entries, 0 snapshots, 0 chunks, 0 running jobs, 0 persisted streams, 0 segments, 0 segment mappings, 0 edit sessions.",
         "@version" => "1",
       "@timestamp" => "2017-11-22T08:46:14.353Z",
             "host" => "lnxolofon",
      "application" => "library",
          "service" => "WindowsService_20",
               "id" => "37d8",
    "log_timestamp" => "11/06/2017-11:00:13",
       "level_info" => "i",
        "level_log" => "INFO",
              "msg" => "Cleaned 0 batch records, 0 policies, 0 sessions, 0 cache entries, 0 snapshots, 0 chunks, 0 running jobs, 0 persisted streams, 0 segments, 0 segment mappings, 0 edit sessions."
}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}

> What do you mean by a more exact pattern?

The DATA and GREEDYDATA patterns match any characters, including your delimiter characters. Under some circumstances this can result in incorrect matches. In this case you could use (?<application>[^!]+) to match everything up to (but not including) the next exclamation point, so the field can never swallow a delimiter.

Secondly, excessive use of DATA and GREEDYDATA can result in extremely bad performance since the input string can be parsed in multiple ways and the regexp engine needs to go over the whole string multiple times. For the same reason you should start your expression with ^.
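
Putting both points together, a sketch of an anchored expression that uses negated character classes instead of DATA could look like this. The field names are the ones used earlier in the thread; the character classes for the timestamp and level fields are assumptions derived from the single sample line and have not been checked against other Report Server log lines.

```
filter {
  grok {
    match => {
      # Each capture stops at its own delimiter: "!" for the first three
      # fields, a space or ":" for the rest. Only the trailing text uses
      # GREEDYDATA.
      "message" => "^(?<application>[^!]+)!(?<service>[^!]+)!(?<id>[^!]+)!(?<log_timestamp>[^ ]+):: (?<level_info>[^ ]+) (?<level_log>[^:]+): %{GREEDYDATA:msg}"
    }
  }
}
```

Because each field can only consume characters up to its own delimiter, there is little room for backtracking, and the leading ^ keeps the engine from retrying the match at later offsets in the line.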


(holiday-sunrise@gmx.de) #5

Your example does not work for me.

I have found the part that cannot be parsed:

11/18/2017-22:41:02::

What's wrong?


(Magnus Bäck) #6

Which version of Logstash?


(holiday-sunrise@gmx.de) #7

Logstash 6.0.0 on a Windows Server 2012 R2 machine (monitoring server)

ELK 6.0.0 on a Windows Server 2012 R2 machine

Filebeat 6.0.0 on a Windows Server 2012 R2 machine (test server), sending to the monitoring server


(Magnus Bäck) #8

Works fine with Logstash 6 on Linux. Could you copy/paste the commands you run in a command prompt to reproduce the problem, similar to what I did above?


(holiday-sunrise@gmx.de) #9

Sorry for my late reply.

We have not solved the problem, but we are one step further. The file encoding is "UCS-2BE". If we use this, the grok filter works, but if we try to parse a date out of the field "log_timestamp", we get another grokfailure.

My config is below:

input {
  beats {
    type => "httpreportserver2"
    port => 5047
    host => "0.0.0.0"
    codec => plain { charset => "UCS-2BE" }
  }
}

filter {
  if [type] == "httpreportserver2" {
    # ignore log comments
    if [message] =~ "^#" {
      drop {}
    }

    grok {
      #  %{DATESTAMP:bla} %{NOTSPACE:serverip} %{WORD:csmethod} %{URIPATH:uriquery} (%{NOTSPACE:queryparam})? %{NUMBER:sport} %{NOTSPACE:csusername} %{NOTSPACE:clientip} %{NUMBER:protocolstatus} %{NUMBER:bytesreceived} %{NUMBER:timetaken}
      match => ["message", "%{DATESTAMP:log_timestamp} %{IP:serverip} %{WORD:csmethod} %{URIPATH:uriquery} (%{NOTSPACE:queryparam})? %{NUMBER:sport} %{NOTSPACE:csusername} %{IP:clientip} %{NUMBER:protocolstatus} %{NUMBER:bytesreceived} %{NUMBER:timetaken}"]
    }

    date {
      match => [ "peng", "MM/dd/yyyy HH:mm:ss" ]
      target => "@timestamp"
    }

    mutate {
      #remove_field => ["log_timestamp"]
      convert => ["timetaken", "integer"]
    }
  }
}

message is now ㄀㈀⼀㄀㐀⼀㈀ ㄀㜀 ㈀㄀㨀 㔀㨀㌀㔀 㨀㨀㄀ 倀伀匀吀 ⼀刀攀瀀漀爀琀匀攀爀瘀攀爀⼀刀攀瀀漀爀琀䔀砀攀挀甀琀椀漀渀㈀ 㔀⸀愀猀洀砀 ㈀ 㜀㌀㘀 ⴀ 㨀㨀㄀ 㐀 ㄀ ㌀㠀㈀ ഀ�

What's wrong?

