Getting _grokparsefailure for grok pattern on [audit_data][messages] field for ModSecurity JSON log?

Input JSON

{"transaction":{"time":"26/Jan/2024:00:54:31 +0530","transaction_id":"16645304250678661185","remote_address":"141.98.7.28","remote_port":80,"local_address":"127.0.0.1","local_port":80},"request":{"request_line":"GET / HTTP/1.1","headers":{"Host":"95.217.32.181:80","User-Agent":"Hello World"}},"response":{"protocol":"HTTP/1.1","status":0,"headers":{}},"audit_data":{"messages":["Warning. Pattern match \"^[\\\\d.:]+$\" at REQUEST_HEADERS:Host. [file \"C:\\/Program Files/ModSecurity IIS/owasp_crs/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf\"] [line \"810\"] [id \"920350\"] [rev \"2\"] [msg \"Host header is a numeric IP address\"] [data \"95.217.32.181:80\"] [severity \"WARNING\"] [ver \"OWASP_CRS/3.0.0\"] [maturity \"9\"] [accuracy \"9\"] [tag \"application-multi\"] [tag \"language-multi\"] [tag \"platform-multi\"] [tag \"attack-protocol\"] [tag \"OWASP_CRS/PROTOCOL_VIOLATION/IP_HOST\"] [tag \"WASCTC/WASC-21\"] [tag \"OWASP_TOP_10/A7\"] [tag \"PCI/6.5.10\"]"],"handler":"IIS","stopwatch":{"p1":2047,"p2":1004,"p3":0,"p4":0,"p5":2048,"sr":2047,"sw":0,"l":0,"gc":2048},"producer":["ModSecurity for IIS (STABLE)/2.9.3 (http://www.modsecurity.org/)","OWASP_CRS/3.0.2"],"server":"ModSecurity Standalone","engine_mode":"ENABLED"}}

logstash.conf

input {
  file{
    path => "/tmp/modsec.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => json
  }
}
filter {
  split { field => "[audit_data][messages]"}

  mutate { remove_field => ["[event][original]"] }

  mutate { remove_field => ["[request][headers][User-Agent]"] }

  if [response][body] { 
    mutate { remove_field => ["[response][body]"] }
  }

  mutate { remove_field => ["[audit_data][stopwatch]"] }
  mutate { remove_field => ["[audit_data][producer]"] }
  mutate { remove_field => ["[audit_data][server]"] }
  mutate { remove_field => ["[audit_data][engine_mode]"] }
  
  grok {
    match => { "[audit_data][messages]" => '%{GREEDYDATA:audit_message} \[file \\"%{DATA:rule_file}".*msg \\"%{DATA:audit_msg}"\].*\[severity \\"%{DATA:sevirity}"'}
  }

}
output {
 file {
   codec => json
   path => "/tmp/logstash_out.log"
 }
}

All I want is to create 4 separate fields from each entry in [audit_data][messages], which contains quotes, backslashes, etc. (This input has only 1 item in the array, btw.)

["Warning. Pattern match \"^[\\\\d.:]+$\" at REQUEST_HEADERS:Host. [file \"C:\\/Program Files/ModSecurity IIS/owasp_crs/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf\"] [line \"810\"] [id \"920350\"] [rev \"2\"] [msg \"Host header is a numeric IP address\"] [data \"95.217.32.181:80\"] [severity \"WARNING\"] [ver \"OWASP_CRS/3.0.0\"] [maturity \"9\"] [accuracy \"9\"] [tag \"application-multi\"] [tag \"language-multi\"] [tag \"platform-multi\"] [tag \"attack-protocol\"] [tag \"OWASP_CRS/PROTOCOL_VIOLATION/IP_HOST\"] [tag \"WASCTC/WASC-21\"] [tag \"OWASP_TOP_10/A7\"] [tag \"PCI/6.5.10\"]
  1. audit_message
  2. rule_file
  3. audit_msg
  4. severity

The pattern works in the Grok Debugger here.

I am getting everything else in the output file; only the grok pattern is not working in Logstash. Why? How do I make it work?

Whenever I have a problem like this, I like to trim the grok pattern down to just one capture, run the pipeline, and make sure it works. Then I add another small capture, run the pipeline again, and make sure it still works, and so on.

Oftentimes there can be small differences between what we think the input data looks like and what it actually looks like. Or there can be special characters or Unicode characters that aren't working as we expect.

The simplest way to figure this out is to just trim your grok and slowly add it back until it stops working again.
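That incremental approach can also be rehearsed outside Logstash with plain Ruby named captures (grok compiles down to the same Ruby regex engine); here is a minimal sketch against a simplified, made-up sample line:

```ruby
# Simplified sample line, for illustration only (not the full ModSecurity message).
line = 'Warning. Pattern match. [file "rules.conf"] [msg "numeric IP"] [severity "WARNING"]'

# Step 1: one small capture; confirm it matches before adding more.
step1 = /\[file "(?<rule_file>[^"]*)"/.match(line)

# Step 2: only after step 1 works, extend the pattern with the next capture.
step2 = /\[file "(?<rule_file>[^"]*)".*\[severity "(?<severity>[^"]*)"/.match(line)
```

If step 2 suddenly returns nil while step 1 matched, the problem lies in whatever was just added.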

You can also log field values from a Ruby filter to make sure they are matching or capturing what you expect at different steps of your pipeline: Logging from within Ruby Filter - #3 by guyboertje

Thanks.

The size of the capture, small or big, is not the issue. I tried with one small word and it didn't work.

The issue is that the Java regex Logstash uses is not the same as the other regex flavors available online.

Logstash simply cannot handle messages with JSON-style escaping that are not themselves exactly JSON.

For example [msg \"Host header is a numeric IP address\"].
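For what it's worth, one likely explanation, sketched in plain Ruby with a trimmed-down sample event: the json codec unescapes `\"` while parsing the line, so by the time grok sees the field it contains plain double quotes, and a pattern like `\\"` is hunting for a backslash that is no longer there.

```ruby
require 'json'

# Trimmed-down sample of the log line; \" is JSON escaping on disk.
raw = '{"audit_data":{"messages":["[msg \\"Host header is a numeric IP address\\"]"]}}'

# After parsing, the escapes are gone: the field holds plain quotes.
message = JSON.parse(raw)["audit_data"]["messages"].first
# message == '[msg "Host header is a numeric IP address"]'

# A pattern expecting backslash-quote finds nothing...
miss = /\[msg \\"(?<audit_msg>[^"]*)\\"\]/.match(message)

# ...while a pattern matching a plain quote works.
hit = /\[msg "(?<audit_msg>[^"]*)"\]/.match(message)
```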

I solved the issue by converting \" into ' with gsub.

This has worked.

  mutate { gsub => [ "[audit_data][messages]", '[\"]', "'"]  }
  
  grok {
    match => {"[audit_data][messages]" => ["%{GREEDYDATA:audit_message}\s\[file\s'%{DATA:rule_file_path}'.+?msg\s'%{DATA:audit_msg}'.+?severity\s'%{DATA:severity}'","%{GREEDYDATA:audit_message}\s\[file\s'%{DATA:rule_file_path}'.+?msg\s'%{DATA:audit_msg}'"]}
  }
  
  mutate { gsub => [ "rule_file_path", '[\\]', ""]  }
   
  grok {
    match => {"rule_file_path" => "C:/Program\sFiles/ModSecurity IIS/owasp_crs/rules/%{DATA:rule_file}$"}
  }

  mutate { remove_field => ["[audit_data][messages]", "rule_file_path"] }
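A quick way to sanity-check that pattern outside Logstash is to replay the gsub-plus-match steps in plain Ruby against an abbreviated version of the message (here `[^']*` stands in for grok's DATA):

```ruby
# Abbreviated version of one [audit_data][messages] entry, post-json-codec.
message = 'Warning. Pattern match "^[\\\\d.:]+$" at REQUEST_HEADERS:Host. ' \
          '[file "C:\\/Program Files/ModSecurity IIS/owasp_crs/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf"] ' \
          '[line "810"] [msg "Host header is a numeric IP address"] [severity "WARNING"]'

# Same effect as: mutate { gsub => [ "[audit_data][messages]", '[\"]', "'" ] }
step1 = message.gsub('"', "'")

# Same shape as the first grok alternative, with DATA approximated by [^']*.
m = /(?<audit_message>.*)\s\[file\s'(?<rule_file_path>[^']*)'.+?msg\s'(?<audit_msg>[^']*)'.+?severity\s'(?<severity>[^']*)'/.match(step1)
```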

So I wasn't suggesting that the size of the grok capture is the cause of the problem.

Because all it takes is a single character not looking the way we expect for the grok pattern to fail, I was suggesting reducing the size of the grok pattern until it starts working, to help identify where the issue in the pattern is.

It sounds like you were able to identify the issue, which is awesome, but starting with something small and building up will make troubleshooting much easier.

Logstash uses Ruby regexps, not Java regexps.
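One concrete example of the difference (plain Ruby, illustrative only): in Ruby's regex engine `\h` matches a hexadecimal digit, whereas in Java (and PCRE) `\h` matches horizontal whitespace, so a pattern copied from a Java-flavored online tester can behave differently inside grok.

```ruby
# In Ruby, \h is a hex digit; in Java regexes it is horizontal whitespace.
hex_match   = "deadbeef".match?(/\A\h+\z/)  # true in Ruby
space_match = " \t".match?(/\A\h+\z/)       # false in Ruby, would be true in Java
```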


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.