Using Grok: How To Parse Multiple Entries With The Same Field Name?

rudyamid · June 7, 2019, 10:36pm

I've been searching for a pre-made Grok pattern for Apache's mod_security error log, but couldn't find one. An example of a log entry:

[Mon Jun 03 15:07:12.453090 2019] [:error] [pid 15595] [client 192.168.0.254:57318] [client 192.168.0.254] ModSecurity: Warning. Matched phrase "bin/bash" at ARGS:exec. [file "/etc/httpd/modsecurity.d/activated_rules/REQUEST-932-APPLICATION-ATTACK-RCE.conf"] [line "500"] [id "932160"] [msg "Remote Command Execution: Unix Shell Code Found"] [data "Matched Data: bin/bash found within ARGS:exec: /bin/bash"] [severity "CRITICAL"] [ver "OWASP_CRS/3.1.0"] [tag "application-multi"] [tag "language-shell"] [tag "platform-unix"] [tag "attack-rce"] [tag "OWASP_CRS/WEB_ATTACK/COMMAND_INJECTION"] [tag "WASCTC/WASC-31"] [tag "OWASP_TOP_10/A1"] [tag "PCI/6.5.2"] [hostname "test.us.mydomain.com"] [uri "/images/random/random-logo.png"] [unique_id "XPWaEGNi1xVn2c58vCAiEwAAAAM"]

Here's my attempt at a pattern for it:

\[%{HTTPDERROR_DATE:timestamp}\] \[(%{WORD:module})?:%{LOGLEVEL:loglevel}\] \[pid %{POSINT:pid}(:tid %{NUMBER:tid})?\] \[client %{IPORHOST:clientip}:%{POSINT:clientport}\] \[client %{IPORHOST:cip2}\] %{WORD:errorsource}: %{DATA:errormsg} \[file \"%{DATA:rulefilename}\"\] \[line \"%{POSINT:rulelinenum}\"\] \[id \"%{POSINT:ruleid}\"\] \[msg \"%{DATA:rulemsg}\"\] \[data \"%{DATA:ruledata}\"\] \[severity \"%{WORD:ruleseverity}\"\] \[ver \"%{DATA:ruleversion}\"\] \[tag \"%{DATA:ruletag1}\"\] \[tag \"%{DATA:ruletag2}\"\] \[tag \"%{DATA:ruletag3}\"\] \[tag \"%{DATA:ruletag4}\"\] \[tag \"%{DATA:ruletag5}\"\] \[tag \"%{DATA:ruletag6}\"\] \[tag \"%{DATA:ruletag7}\"\] \[tag \"%{DATA:ruletag8}\"\] \[hostname \"%{HOSTNAME:hostname}\"\] \[uri \"%{URIPATHPARAM:uri}\"\] \[unique_id \"%{WORD:uniqueid}\"\]

It's a brute-force approach. However, mod_security also logs two more lines for the same error:

[Mon Jun 03 15:07:12.454321 2019] [:error] [pid 15595] [client 192.168.0.254:57318] [client 192.168.0.254] ModSecurity: Access denied with code 403 (phase 2). Operator GE matched 5 at TX:anomaly_score. [file "/etc/httpd/modsecurity.d/activated_rules/REQUEST-949-BLOCKING-EVALUATION.conf"] [line "91"] [id "949110"] [msg "Inbound Anomaly Score Exceeded (Total Score: 5)"] [severity "CRITICAL"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-generic"] [hostname "test.us.mydomain.com"] [uri "/images/random/random-logo.png"] [unique_id "XPWaEGNi1xVn2c58vCAiEwAAAAM"]

[Mon Jun 03 15:07:12.454684 2019] [:error] [pid 15595] [client 192.168.0.254:57318] [client 192.168.0.254] ModSecurity: Warning. Operator GE matched 5 at TX:inbound_anomaly_score. [file "/etc/httpd/modsecurity.d/activated_rules/RESPONSE-980-CORRELATION.conf"] [line "86"] [id "980130"] [msg "Inbound Anomaly Score Exceeded (Total Inbound Score: 5 - SQLI=0,XSS=0,RFI=0,LFI=0,RCE=5,PHPI=0,HTTP=0,SESS=0): individual paranoia level scores: 5, 0, 0, 0"] [tag "event-correlation"] [hostname "test.us.mydomain.com"] [uri "/images/random/random-logo.png"] [unique_id "XPWaEGNi1xVn2c58vCAiEwAAAAM"]

Notice that the number of [tag] entries differs from message to message: the first has 8, the second has 4, and the third has only 1. Obviously my brute-force approach won't match all of them. Is there an elegant way to parse those repeated [tag] fields with grok? I wonder if some Ruby parsing magic needs to be applied here.

Thanks in advance.

Badger · June 8, 2019, 12:44am

Do not try to do it all with grok. I would break off the initial common section with dissect, then pull out the ModSecurity message using grok, then chop up the rest using a kv filter. Something like:

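    # Split off the fixed prefix; the remainder goes to [@metadata][restOfLine].
    dissect { mapping => { "message" => "[%{ts}] [:%{level}] [pid %{pid}] [client %{clientA}] [client %{clientB}] %{[@metadata][restOfLine]}" } }
    # Capture the free-text ModSecurity message; the bracketed pairs go to [@metadata][theRest].
    grok { match => { "[@metadata][restOfLine]" => [ "ModSecurity: (?<theMessage>[^\[]+ )(?<[@metadata][theRest]>\[.*)" ] } }
    # Turn each [key "value"] pair into a field; a key that repeats (such as tag) becomes an array.
    kv { source => "[@metadata][theRest]" field_split => "\]\[" value_split => " " }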

grok is one of the most powerful (and popular) filters for parsing events, but that flexibility comes at a cost. That's exactly why you should at least consider the other filters to see whether something more specific (and therefore cheaper) can do the job.
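To get a feel for what that three-stage split produces, here is a rough plain-Ruby imitation of it. This is only a sketch and not Logstash code: `parse_modsec` is a hypothetical helper, and the regexes are my own approximations of what dissect, grok, and kv do. It assumes values never contain an embedded double quote.

```ruby
# Rough plain-Ruby imitation of a dissect -> grok -> kv chain.
# parse_modsec is a hypothetical helper, not part of Logstash.
def parse_modsec(line)
  # Stage 1 (dissect-like): fixed bracketed prefix, then everything else.
  prefix = /\A\[(?<ts>[^\]]+)\] \[:(?<level>[^\]]+)\] \[pid (?<pid>[^\]]+)\] \[client (?<clientA>[^\]]+)\] \[client (?<clientB>[^\]]+)\] (?<rest>.*)\z/
  m = prefix.match(line)
  return nil unless m

  event = { "ts" => m[:ts], "level" => m[:level], "pid" => m[:pid],
            "clientA" => m[:clientA], "clientB" => m[:clientB] }

  # Stage 2 (grok-like): free-text message up to the first "[", then the bracketed tail.
  g = /\AModSecurity: (?<msg>[^\[]+ )(?<tail>\[.*)\z/.match(m[:rest])
  return event unless g

  event["theMessage"] = g[:msg]

  # Stage 3 (kv-like): every [key "value"] pair; a repeated key collects into an array.
  g[:tail].scan(/\[(\w+) "([^"]*)"\]/) do |key, value|
    if event.key?(key)
      event[key] = Array(event[key]) << value
    else
      event[key] = value
    end
  end
  event
end

# Shortened version of the third sample line in this thread.
line = '[Mon Jun 03 15:07:12.454684 2019] [:error] [pid 15595] ' \
       '[client 192.168.0.254:57318] [client 192.168.0.254] ' \
       'ModSecurity: Warning. Operator GE matched 5 at TX:inbound_anomaly_score. ' \
       '[id "980130"] [tag "event-correlation"] [hostname "test.us.mydomain.com"]'
p parse_modsec(line)
```

The point of the sketch is the last stage: because the tail is treated as generic key/value pairs, it does not matter whether a line carries 8, 4, or 1 tag entries.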

kv produces an array for tag when the key repeats, but a plain string when it appears only once. If you need tag to always be an array, even with a single member, I would use a ruby filter for that.
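A minimal sketch of that normalization, written so it runs outside Logstash. `FakeEvent` and `normalize_tag` are hypothetical names I made up for illustration; the only thing borrowed from Logstash is the `get`/`set` style of the event API, so the body of `normalize_tag` is roughly what one might put in the `code` option of a ruby filter.

```ruby
# Minimal stand-in for a Logstash event so the snippet runs on its own.
# FakeEvent is hypothetical; it only mimics the get/set calls used below.
class FakeEvent
  def initialize(fields)
    @fields = fields
  end

  def get(name)
    @fields[name]
  end

  def set(name, value)
    @fields[name] = value
  end
end

# Wrap a lone string in an array; leave arrays and missing values alone.
def normalize_tag(event)
  tag = event.get("tag")
  event.set("tag", [tag]) if tag.is_a?(String)
end

single = FakeEvent.new("tag" => "event-correlation")
normalize_tag(single)
p single.get("tag")   # prints ["event-correlation"]
```

Checking for a String rather than always wrapping avoids nesting an array that kv has already built for lines with several tags.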

rudyamid · June 11, 2019, 8:00pm

Awesome! Thanks for pointing me in the right direction!

