Grok patterns not executing properly in Logstash filter

I have the following log, which has to be normalized using grok patterns:

<46>Sep 29 12:10:36 SXX-XX-XO SFIMS: [133:51:1] dcerpc2: SMB - Outstanding requests with the same MID [Impact: Currently Not Vulnerable] From \"XXX-XO-02\" at Fri Sep 29 12:10:34 2017 UTC [Classification: Potentially Bad Traffic] [Priority: 2] {tcp} 192.168.1.88:55422 (unknown)->172.2.2.1:445 (unknown)

I wrote the following grok filter for the above log in Logstash:

filter {

grok {
	match => [ "message", "<%{POSINT:pri_id}>%{SYSLOGTIMESTAMP:log_timestamp} %{HOSTNAME:hostname} %{WORD:source}: \[%{DATA:num}\] %{GREEDYDATA:signature} \[Impact: %{DATA:impact}\] From \\"%{DATA:device}\\" %{WORD:seq} %{WORD:day} %{SYSLOGTIMESTAMP:trigger_timestamp} %{DATA:list_year} %{WORD:time_zone} \[Classification: %{GREEDYDATA:classification}\] \[Priority: %{NUMBER:priority}\] \{%{DATA:protocol}\} (?<srcip>[0-9]+.[0-9]+.[0-9]+.[0-9]+|N/A):(?<srcport>[0-9]+|N/A) \(%{DATA:srcname}\)->(?<dstip>[0-9]+.[0-9]+.[0-9]+.[0-9]+|N/A):(?<dstport>[0-9]+|N/A) \(%{DATA:dstname}\)"]
}	

mutate {
	remove_field => [ "pri_id", "num", "seq", "day", "time_zone, "list_year" ]
  }
}

But when I run this .conf file it returns the following error:

Sending Logstash's logs to /etc/logstash-5.2.2/logs which is now configured via log4j2.properties
[2017-10-02T12:14:35,554][ERROR][logstash.agent           ] Cannot load an invalid configuration {:reason=>"Expected one of #, {, ,, ] at line 14, column 65 (byte 727) after filter {\r\n\r\n\tgrok {\r\n\t\tmatch => [ \"message\", \"<%{POSINT:pri_id}>%{SYSLOGTIMESTAMP:log_timestamp} %{HOSTNAME:hostname} %{WORD:source}: \\[%{DATA:num}\\] %{GREEDYDATA:signature} \\[Impact: %{DATA:impact}\\] From \\\\\"%{DATA:device}\\\\\" %{WORD:seq} %{WORD:day} %{SYSLOGTIMESTAMP:trigger_timestamp} %{DATA:list_year} %{WORD:time_zone} \\[Classification: %{GREEDYDATA:classification}\\] \\[Priority: %{NUMBER:priority}\\] \\{%{DATA:protocol}\\} (?<srcip>[0-9]+.[0-9]+.[0-9]+.[0-9]+|N/A):(?<srcport>[0-9]+|N/A) \\(%{DATA:srcname}\\)->(?<dstip>[0-9]+.[0-9]+.[0-9]+.[0-9]+|N/A):(?<dstport>[0-9]+|N/A) \\(%{DATA:dstname}\\)\"]\r\n\t}\t\r\n\r\n\tmutate {\r\n\t\tremove_field => [ \"pri_id\", \"num\", \"seq\", \"day\", \"time_zone, \""}

But when I run the same grok pattern in https://grokdebug.herokuapp.com/ it executes successfully and displays the results.

I need help sorting out this issue.

The error log shows exactly where the configuration breaks: you're missing a closing quote on the time_zone field in

mutate {
	remove_field => [ "pri_id", "num", "seq", "day", "time_zone, "list_year" ]
  }
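For reference, the corrected mutate block, with the closing quote added after time_zone, looks like this:

```
mutate {
  remove_field => [ "pri_id", "num", "seq", "day", "time_zone", "list_year" ]
}
```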

Sorry, that was my mistake, and I have corrected the enclosing quote on time_zone. But after correcting it, the filter gives a "_grokparsefailure", as below:

{
"@timestamp" => 2017-10-02T09:39:09.196Z,
  "@version" => "1",
      "host" => "192.168.50.15",
   "message" => "<46>Oct  2 09:38:29 SXX-XX-XO SFIMS: [1:34463:3] \"APP-DETECT TeamViewer remote administration tool outbound connection attempt\" [Impact: Potentially Vulnerable] From \"XXX-XO-02\" at Mon Oct  2 09:38:28 2017 UTC [Classification: Potential Corporate Policy Violation] [Priority: 1] {tcp} 192.168.1.111:51523 (unknown)->192.168.1.80:8080 (unknown)",
      "tags" => [
    [0] "_grokparsefailure"
]
}

Start with the simplest possible expression (<%{POSINT:pri_id}>) and make sure that works. Continue to add more and more until you've found where it breaks.
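As a sketch of that approach (the stdin/stdout plugins here are only an assumption for quick local testing; your real pipeline will use different inputs and outputs), the starting configuration could be:

```
input { stdin {} }

filter {
  grok {
    # Step 1: match only the syslog priority and capture everything else.
    # Once this works, move one field at a time out of "rest" into its own pattern.
    match => [ "message", "<%{POSINT:pri_id}>%{GREEDYDATA:rest}" ]
  }
}

output { stdout { codec => rubydebug } }
```

Paste a sample log line on stdin, confirm pri_id is extracted without a _grokparsefailure tag, then extend the pattern step by step until you find where it breaks.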

@magnusbaeck,

As you said, I started building the grok pattern up one piece at a time and tested it fully in https://grokdebug.herokuapp.com/ with sample logs; it correctly extracted all of the tested data.

But when I run the same grok pattern in the real environment it gives a grok parse failure.

Start with the simplest possible expression in Logstash and build from there. Testing expressions in the grok debugger is useful but if you want it to work in Logstash that's where you should test things.

Now, having a quick look at your grok expression, you use multiple GREEDYDATA and DATA patterns. This is a really, really bad idea, and even if it isn't the cause of your current issue, it will result in poor performance. Any grok expression with more than one GREEDYDATA or DATA pattern is very likely inefficient.
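For example, each bracketed field in the pattern above can be matched with a character class that stops at the closing bracket instead of DATA. This is only a sketch of the idea, not a drop-in replacement for the full pattern:

```
grok {
  # [^\]]* cannot run past the closing bracket, so the regex engine never
  # has to backtrack the way it must with %{DATA}/%{GREEDYDATA}.
  match => [ "message", "\[Impact: (?<impact>[^\]]*)\] From" ]
}
```

The same idea applies to the Classification field, and the hand-written IP regexes could use the standard %{IP} pattern, which is more precise than [0-9]+.[0-9]+ with unescaped dots.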


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.