Grok error

Hi All,
I'm new to Logstash and grok, and I'm getting errors with my grok statement that I was hoping somebody could help me with.

My log entry looks like this:
50.60.70.80 [08/Aug/2018:16:25:48 +0100] 50.60.70.72 /Common/Application_HTTPS/VirtualServer /Common/Application_HTTPS/Service_Pool 50.60.70.87 \"\" \"\" 13897 6365 \"\" \"\"

And this is my conf file. The first grok statement is what should be handling this traffic:
filter {

  ### GROK Statement to catch Traffic from *** Application ###
  if [type] == "syslog" {
    if "Application_HTTPS" in [message] {
      grok {
        match => { "message" => "%{URIHOST:client_ip} %{SYSLOG5424SD:timestamp} %{IP:virtual_ip} %{URIPATHPARAM:virtual_name} %{URIPATHPARAM:virtual_pool_name} %{IP:server} %{NUMBER:server_port} \\\"\\\" \\\"\\\" %{NUMBER:packet_size:bytes:int} %{NUMBER:response:ms:int} \\\"\\\" \\\"\\\"" }
      }

      translate {
        regex => true
        dictionary_path => "/etc/logstash/jsontranslate.yml"
        field => "message"
      }
      json {
        source => "translation"
      }
    } else {
      grok {
        match => { "message" => "%{IP:clientip} \[%{HTTPDATE:timestamp}\] %{IP:virtual_ip} %{DATA:virtual_name} %{DATA:virtual_pool_name} %{DATA:server} %{NUMBER:server_port} \"%{DATA:path}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response:int} %{NUMBER:bytes:int} %{NUMBER:response_ms:int} %{QS:referrer} %{QS:agent}" }
      }

      ### TRANSLATE statement to convert incoming IP Addresses to Geo-Location using jsontranslate.yml ###
      translate {
        regex => true
        dictionary_path => "/etc/logstash/jsontranslate.yml"
        field => "message"
      }
      json {
        source => "translation"
      }
    }
  }
}

output {
  elasticsearch {
    hosts => ["10.128.10.10:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}

I've tested this on https://grokdebug.herokuapp.com/ and it works there, but it doesn't work in Logstash.
Message:
50.60.70.80 [08/Aug/2018:16:25:48 +0100] 50.60.70.72 /Common/Application/VirtualServer /Common/Application/Service_Pool 50.60.70.87 443 \"\" \"\" 13897 6365 \"\" \"\"

Pattern:
%{URIHOST:client_ip} %{SYSLOG5424SD:timestamp} %{IP:virtual_ip} %{URIPATHPARAM:virtual_name} %{URIPATHPARAM:virtual_pool_name} %{IP:server} %{NUMBER:server_port} \\\"\\\" \\\"\\\" %{NUMBER:packet_size:bytes:int} %{NUMBER:response:ms:int} \\\"\\\" \\\"\\\"

Response:
{
  "client_ip": [["50.60.70.80"]],
  "IPORHOST": [["50.60.70.80"]],
  "HOSTNAME": [["50.60.70.80"]],
  "IP": [[null]],
  "IPV6": [[null, null, null]],
  "IPV4": [[null, "50.60.70.72", "50.60.70.87"]],
  "port": [[null]],
  "timestamp": [["[08/Aug/2018:16:25:48 +0100]"]],
  "DATA": [["08/Aug/2018:16:25:48 +0100"]],
  "virtual_ip": [["50.60.70.72"]],
  "virtual_name": [["/Common/Application/VirtualServer"]],
  "URIPATH": [["/Common/Application/VirtualServer", "/Common/Application/Service_Pool"]],
  "URIPARAM": [[null, null]],
  "virtual_pool_name": [["/Common/Application/Service_Pool"]],
  "server": [["50.60.70.87"]],
  "server_port": [["443"]],
  "BASE10NUM": [["443", "13897", "6365"]],
  "packet_size": [["13897"]],
  "response": [["6365"]]
}

\\\"\\\" \\\"\\\"

This is probably why it doesn't work. Just use the QS pattern to match this.

Thank you very much for the info.
I've tried using %{QS} to capture these but it is not working.

Would it be possible to get an example? I really appreciate the assistance

Okay, then it might be something else. Build your expression gradually by starting with the simplest possible expression (^%{URIHOST:client_ip}). Verify that it works and continue building your expression step by step until things break.
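For example, something like this (just a rough sketch, assuming you test with a stdin input so you can paste your sample line and watch the rubydebug output) lets you grow the expression one pattern at a time:

input { stdin {} }

filter {
  grok {
    # Start with the first field only...
    match => { "message" => "^%{URIHOST:client_ip}" }
    # ...then extend the expression step by step, e.g.:
    # "^%{URIHOST:client_ip} %{SYSLOG5424SD:timestamp}"
    # "^%{URIHOST:client_ip} %{SYSLOG5424SD:timestamp} %{IP:virtual_ip}"
  }
}

output { stdout { codec => rubydebug } }

The first extension that produces a _grokparsefailure tag points at the pattern that needs attention.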

Hi Magnus,
Thank you for the feedback. I resolved the grok syntax issue by using the DATA pattern, but then I stumbled across another issue. This grok works when it is the only expression in the conf file, but not when I have an else statement. I obviously have the syntax wrong somewhere, but I was hoping you could have a look because I can't find the issue.

Everything runs through the first grok statement and not the second.

#input { stdin {} }
input {
  beats {
    port => 5044
  }
  tcp {
    port => 5000
    type => syslog
  }
  udp {
    port => 5000
    type => syslog
  }
}

filter {

  ### GROK Statement to catch Application Traffic from the F5 ###
  if [type] == "syslog" {
    if "Application_HTTPS" in [message] {
      grok {
        match => { "message" => "%{URIHOST:client_ip} %{SYSLOG5424SD:timestamp} %{IP:virtual_ip} %{URIPATHPARAM:virtual_name} %{URIPATHPARAM:virtual_pool_name} %{IP:server} %{NUMBER:server_port} %{DATA:junk} %{DATA:junk2} %{NUMBER:packet_size:bytes:int} %{NUMBER:response_ms:ms:int} %{DATA:junk3} %{DATA:junk4}" }
      }

      translate {
        regex => true
        dictionary_path => "/etc/logstash/jsontranslate.yml"
        field => "message"
      }
      json {
        source => "translation"
      }
    } else {
      grok {
        match => { "message" => "%{IP:clientip} \[%{HTTPDATE:timestamp}\] %{IP:virtual_ip} %{DATA:virtual_name} %{DATA:virtual_pool_name} %{DATA:server} %{NUMBER:server_port} \"%{DATA:path}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response:int} %{NUMBER:bytes:int} %{NUMBER:response_ms:int} %{QS:referrer} %{QS:agent}" }
      }

      translate {
        regex => true
        dictionary_path => "/etc/logstash/jsontranslate.yml"
        field => "message"
      }
      json {
        source => "translation"
      }
    }
  }
}

output {
  elasticsearch {
    hosts => ["x.x.x.x:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}

Any help would be greatly appreciated

I don't understand. As you've written your configuration, a message will obviously never hit both grok filters, which seems to make sense since the grok expressions are nearly the same.

Thanks Magnus,
Could you make any suggestions as to how I could fix this?

What I want is for every message containing "Application_HTTPS" to be pushed through the first grok filter and everything else to be pushed through the second.

Thank you for your patience on this topic, this is very new to me

So messages with the "syslog" type whose message contains "Application_HTTPS" should be processed by the first grok, and all other messages should be processed by the second grok?

Yes, that is what is needed for now. In truth, I will most likely be adding other filters going forward, but that is all I need at the moment.
I think if I get this working, I'll have a better chance of working out future changes without bothering you!

if [type] == "syslog" and "Application_HTTPS" in [message] {
  grok {
    ...
  }
} else {
  grok {
    ...
  }
}
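Filled in with your two expressions and the translate/json filters, that would look roughly like this (a sketch only; the grok expressions are shortened to "..." here and should stay exactly as you already have them):

filter {
  if [type] == "syslog" and "Application_HTTPS" in [message] {
    grok {
      match => { "message" => "%{URIHOST:client_ip} %{SYSLOG5424SD:timestamp} ..." }
    }
    translate {
      regex => true
      dictionary_path => "/etc/logstash/jsontranslate.yml"
      field => "message"
    }
    json {
      source => "translation"
    }
  } else {
    grok {
      match => { "message" => "%{IP:clientip} \[%{HTTPDATE:timestamp}\] ..." }
    }
    translate {
      regex => true
      dictionary_path => "/etc/logstash/jsontranslate.yml"
      field => "message"
    }
    json {
      source => "translation"
    }
  }
}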

Actually, I was wrong. This hasn't resolved the issue!
Both filters work when they are the only one in the conf file, but everything is still being run through the first filter and I am getting _grokparsefailure on everything that should be going to the second filter.

filter {

  ### GROK Statement to catch Application Traffic from the F5 ###
  if [type] == "syslog" and "Application_HTTPS" in [message] {
    grok {
      match => { "message" => "%{URIHOST:client_ip} %{SYSLOG5424SD:timestamp} %{IP:virtual_ip} %{URIPATHPARAM:virtual_name} %{URIPATHPARAM:virtual_pool_name} %{IP:server} %{NUMBER:server_port} %{DATA:junk} %{DATA:junk2} %{NUMBER:packet_size:bytes:int} %{NUMBER:response_ms:ms:int} %{DATA:junk3} %{DATA:junk4}" }
    }

    translate {
      regex => true
      dictionary_path => "/etc/logstash/jsontranslate.yml"
      field => "message"
    }
    json {
      source => "translation"
    }
  } else {
    grok {
      match => { "message" => "%{IP:clientip} \[%{HTTPDATE:timestamp}\] %{IP:virtual_ip} %{DATA:virtual_name} %{DATA:virtual_pool_name} %{DATA:server} %{NUMBER:server_port} \"%{DATA:path}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response:int} %{NUMBER:bytes:int} %{NUMBER:response_ms:int} %{QS:referrer} %{QS:agent}" }
    }

    translate {
      regex => true
      dictionary_path => "/etc/logstash/jsontranslate.yml"
      field => "message"
    }
    json {
      source => "translation"
    }
  }
}

Please show an example event that's incorrectly processed. Use a stdout { codec => rubydebug } output or copy/paste from the JSON tab in Kibana's Discover panel.

Hi Magnus,

This is the JSON from one of the failed events.

{
  "_index": "logstash-2018.08.15",
  "_type": "doc",
  "_id": "HycAPWUBO92FmnxQkglH",
  "_version": 1,
  "_score": null,
  "_source": {
    "type": "syslog",
    "@version": "1",
    "host": "10.128.00.000",
    "tags": [
      "_grokparsefailure"
    ],
    "translation": "{\"geoip\": {\"lat\": 54.111111, \"lon\": -5.111111, \"location\": [-5.111111, 54.111111]}}",
    "geoip": {
      "lat": 54.111111,
      "location": [
        -5.111111,
        54.111111
      ],
      "lon": -5.111111
    },
    "@timestamp": "2018-08-15T09:52:50.229Z",
    "message": "50.60.70.80 [15/Aug/2018:10:52:52 +0100] 10.120.10.10 /Common/Application_HTTPS_UAT_SYNC_AppService.app/Application_HTTPS_UAT_SYNC_AppService_vs /Common/Application_HTTPS_UAT_SYNC_AppService.app/Application_HTTPS_UAT_SYNC_AppService_pool 10.120.10.10 443 \"\" \"\"  2020 23 \"\" \"\""
  },
  "fields": {
    "@timestamp": [
      "2018-08-15T09:52:50.229Z"
    ]
  },
  "highlight": {
    "message": [
      "50.64.194.85 [15/Aug/2018:10:52:52 +0100] 10.128.15.71 /Common/@kibana-highlighted-field@Application_HTTPS_UAT_SYNC_AppService.app@/kibana-highlighted-field@/@kibana-highlighted-field@Application_HTTPS_UAT_SYNC_AppService_vs@/kibana-highlighted-field@ /Common/@kibana-highlighted-field@Application_HTTPS_UAT_SYNC_AppService.app@/kibana-highlighted-field@/@kibana-highlighted-field@Application_HTTPS_UAT_SYNC_AppService_pool@/kibana-highlighted-field@ 10.128.14.61 443 \"\" \"\"  2020 23 \"\" \"\""
    ]
  },
  "sort": [
    1534326770229
  ]
}

This example is hitting the first of the two sets of filters, not the second, because the

if [type] == "syslog" and "Application_HTTPS" in [message] {

conditional is true. Is that what you expected?

The advice I gave earlier still applies; build your grok expressions gradually so that it's clear what part of the expression is breaking the matching.

Hi Magnus,
Yes, the example request should hit the first filter, but it doesn't seem to be working correctly as I am not getting the additional fields in Kibana.

The grok is fine; the problem is with the if / else statement. If I keep the first filter and remove the second, then these messages process correctly.

Please set the tag_on_failure option in your grok filters to a different string for each filter so we can see with certainty which grok filter is failing.
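For example, something along these lines (the tag strings are just examples; use whatever makes the two easy to tell apart):

if [type] == "syslog" and "Application_HTTPS" in [message] {
  grok {
    match => { "message" => "..." }
    tag_on_failure => ["failed_first_grok"]
  }
} else {
  grok {
    match => { "message" => "..." }
    tag_on_failure => ["failed_second_grok"]
  }
}

The tags array on each event will then show exactly which grok filter failed.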

Hi Magnus,
I have made progress with this (I think!). I have three applications feeding into Logstash right now. Applications 1 and 3 are fine and their messages are being processed correctly.
Application 2 can produce different messages. Most have HTTP response codes but some do not, so I have used an if / else statement with two filters. Messages with a response code go through "Application 2 Filter 1" successfully, but messages with no HTTP response code, which should be going through "Application 2 Filter 2", instead fail on "Application 2 Filter 1". "Application 2 Filter 2" works when it is the only filter in the conf file, so it must be something to do with the if / else.
Could you look at my conf file and see if you notice any errors, please?

#input { stdin {} }
input {
  beats {
    port => 5044
  }
  tcp {
    port => 5000
    type => syslog
  }
  udp {
    port => 5000
    type => syslog
  }
}

filter {

  ### GROK Statement to catch Application1 Traffic from the F5 ###
  if [type] == "syslog" and "PULSE" in [message] {
    grok {
      match => { "message" => "%{URIHOST:client_ip} %{SYSLOG5424SD:timestamp} %{IP:virtual_ip} %{URIPATHPARAM:virtual_name} %{URIPATHPARAM:virtual_pool_name} %{IP:server} %{NUMBER:server_port} %{DATA:junk} %{DATA:junk2} %{NUMBER:packet_size:bytes:int} %{NUMBER:response_ms:ms:int} %{DATA:junk3} %{DATA:junk4}" }
      tag_on_failure => ["Failed on Filter 1"]
    }

    translate {
      regex => true
      dictionary_path => "/etc/logstash/jsontranslate.yml"
      field => "message"
    }
    json {
      source => "translation"
    }
  }

  ### Application2 Filter 1 - To be used where messages have response codes (i.e. 200, 404 etc) ###
  else if [type] == "syslog" {
    if "Application2" in [message] {
      grok {
        match => { "message" => "%{IP:clientip} \[%{HTTPDATE:timestamp}\] %{IP:virtual_ip} %{DATA:virtual_name} %{DATA:virtual_pool_name} %{IP:Server} %{NUMBER:server_port} \"%{DATA:path}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response:int} %{NUMBER:bytes:int} %{NUMBER:response_ms:int} %{QS:referrer} %{QS:agent}" }
        tag_on_failure => ["Failed on Application2 Filter 1"]
      }

      translate {
        regex => true
        dictionary_path => "/etc/logstash/jsontranslate.yml"
        field => "message"
      }
      json {
        source => "translation"
      }
    }

    ### Application2 Filter 2 - To be used where messages do not have response codes ###
    else {
      grok {
        match => { "message" => "%{IP:clientip} \[%{HTTPDATE:timestamp}\] %{IP:virtual_ip} %{DATA:virtual_name} %{DATA:virtual_pool_name} %{IP:Server} %{NUMBER:server_port} \"%{DATA:path}\" \"(?:%{WORD:verb} %{NOTSPACE:request} (?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:bytes:int} %{NUMBER:response_ms_int} %{QS:referrer} %{QS:agent}" }
        tag_on_failure => ["Failed on Application2 Filter 2"]
      }

      translate {
        regex => true
        dictionary_path => "/etc/logstash/jsontranslate.yml"
        field => "message"
      }
      json {
        source => "translation"
      }
    }
  }

  else if [type] == "syslog" and "Application3" in [message] {
    grok {
      match => { "message" => "%{IP:clientip} \[%{HTTPDATE:timestamp}\] %{IP:virtual_ip} %{DATA:virtual_name} %{DATA:virtual_pool_name} %{DATA:server} %{NUMBER:server_port} %{DATA:junk} %{DATA:junk2} %{NUMBER:packet_size:int} %{NUMBER:response_ms:ms:int} %{DATA:junk3} %{DATA:junk4}" }
      tag_on_failure => ["Failed on Filter 3 - Application3 Filter"]
    }

    translate {
      regex => true
      dictionary_path => "/etc/logstash/jsontranslate.yml"
      field => "message"
    }
    json {
      source => "translation"
    }
  }
}

output {
  elasticsearch {
    hosts => ["xx.xxx.xx.xx:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}

An example of some failed messages:

{
       "message" => "xx.xxx.xx.xxx [22/Aug/2018:11:00:27 +0100] xx.xxx.xx.xx /Common/Application2_HTTPS_PT /Common/Application2_HTTP_PT xx.xxx.xx.xx 8080 \"/share/service/modules/authenticated\" \"GET /share/service/modules/authenticated?noCache=1534153205458&a=user HTTP/1.1\"  0 0 \"https://Application2pt.domain.local/share/page/user/admin/dashboard\" \"Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0\"",
          "host" => "xx.xxx.xx.xxx",
      "@version" => "1",
          "tags" => [
        [0] "Failed on Application2 Filter 1"
    ],
    "@timestamp" => 2018-08-22T10:00:22.731Z,
          "type" => "syslog"
}
{
       "message" => "xx.xxx.xx.xxx [22/Aug/2018:11:00:27 +0100] xx.xxx.xx.xx /Common/Application2_HTTPS_PT /Common/Application2_HTTP_PT xx.xxx.xx.xx 8080 \"/share/service/modules/authenticated\" \"GET /share/service/modules/authenticated?noCache=1534153205458&a=user HTTP/1.1\"  0 0 \"https://Application2pt.domain.local/share/page/user/admin/dashboard\" \"Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0\"",
          "host" => "xx.xxx.xx.xxx",
      "@version" => "1",
          "tags" => [
        [0] "Failed on Application2 Filter 1"

For the last time, build your grok expressions gradually so that it's clear what part of the expression is breaking the matching.
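For example, keep only the front half of your Application2 Filter 1 expression (copied from the config above; the tag string is just an example) and see whether that much matches:

grok {
  match => { "message" => "%{IP:clientip} \[%{HTTPDATE:timestamp}\] %{IP:virtual_ip} %{DATA:virtual_name} %{DATA:virtual_pool_name} %{IP:Server} %{NUMBER:server_port} \"%{DATA:path}\"" }
  tag_on_failure => ["failed_front_half"]
}

If that matches, add the remaining patterns back one at a time until the failure reappears; the last pattern added is the one to look at.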
