Logstash filter plugins not operating on field

I have been working with the notoriously difficult modsecurity_audit.log to gather statistics for tuning our web application firewall. So far, I've been able to separate the different sections (A, B, E, F, J, and H, among others) and load them into different filter plugins using the material from here: https://github.com/bitsofinfo/logstash-modsecurity. This was very helpful, but as the author says, it really serves as a foundation for what I want to do.

Once I had Section H separated out from the rest of the log, I tried performing some operations on just that section. I cut and pasted just the section H material into a separate log and ran it through its own pipeline so that I wouldn't be distracted by the other material:

section.h:

Message: Warning. Pattern match "(?i)[\\s\\S](?:x(?:link:href|html|mlns)|!ENTITY.*?SYSTEM|data:text\\/html|pattern(?=.*?=)|formaction|\\@import|base64)\\b" at ARGS:acceptHeader. [file "/usr/share/modsecurity-crs/rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf"] [line "158"] [id "941130"] [rev "2"] [msg "XSS Filter - Category 3: Attribute Vector"] [data "Matched Data: /xhtml found within ARGS:acceptHeader: text/html,application/xhtml xml,application/xml;q=0.9,*/*;q=0.8"] [severity "CRITICAL"] [ver "OWASP_CRS/3.0.0"] [maturity "1"] [accuracy "8"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-xss"] [tag "OWASP_CRS/WEB_ATTACK/XSS"] [tag "WASCTC/WASC-8"] [tag "WASCTC/WASC-22"] [tag "OWASP_TOP_10/A3"] [tag "OWASP_AppSensor/IE1"] [tag "CAPEC-242"]
Message: Warning. Pattern match "(?i)[\\s\\S](?:x(?:link:href|html|mlns)|!ENTITY.*?SYSTEM|data:text\\/html|pattern(?=.*?=)|formaction|\\@import|base64)\\b" at ARGS:acceptHeader. [file "/usr/share/modsecurity-crs/rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf"] [line "158"] [id "941130"] [rev "2"] [msg "XSS Filter - Category 3: Attribute Vector"] [data "Matched Data: /xhtml found within ARGS:acceptHeader: text/html,application/xhtml xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"] [severity "CRITICAL"] [ver "OWASP_CRS/3.0.0"] [maturity "1"] [accuracy "8"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-xss"] [tag "OWASP_CRS/WEB_ATTACK/XSS"] [tag "WASCTC/WASC-8"] [tag "WASCTC/WASC-22"] [tag "OWASP_TOP_10/A3"] [tag "OWASP_AppSensor/IE1"] [tag "CAPEC-242"]

Using the 'dissect' plugin, I was able to separate this into three parts:


input {

  file {
    path => "/var/log/sectionh.log"
    type => "section_h"
  }
}

filter {
    dissect {
        mapping => {
            "message" => "%{?idea}. %{?before}. %{after}"
        }
    }
}

output {

  # turn this off when ready to run in a 
  # real prod environment and get rid of the   
  # "-v" flag when starting logstash
  stdout { codec => rubydebug }
#  elasticsearch {
#    index => "modsecurity-%{+YYYY.MM.dd}"
#    hosts => ["http://0.0.0.0:9200"]
#  }
#stdout {}
}

Simple, right? Here is the output:

{
          "host" => "elk-stack",
         "after" => "[file \"/usr/share/modsecurity-crs/rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf\"] [line \"158\"] [id \"941130\"] [rev \"2\"] [msg \"XSS Filter - Category 3: Attribute Vector\"] [data \"Matched Data: /xhtml found within ARGS:acceptHeader: text/html,application/xhtml xml,application/xml;q=0.9,*/*;q=0.8\"] [severity \"CRITICAL\"] [ver \"OWASP_CRS/3.0.0\"] [maturity \"1\"] [accuracy \"8\"] [tag \"application-multi\"] [tag \"language-multi\"] [tag \"platform-multi\"] [tag \"attack-xss\"] [tag \"OWASP_CRS/WEB_ATTACK/XSS\"] [tag \"WASCTC/WASC-8\"] [tag \"WASCTC/WASC-22\"] [tag \"OWASP_TOP_10/A3\"] [tag \"OWASP_AppSensor/IE1\"] [tag \"CAPEC-242\"]",
       "message" => "Message: Warning. Pattern match \"(?i)[\\\\s\\\\S](?:x(?:link:href|html|mlns)|!ENTITY.*?SYSTEM|data:text\\\\/html|pattern(?=.*?=)|formaction|\\\\@import|base64)\\\\b\" at ARGS:acceptHeader. [file \"/usr/share/modsecurity-crs/rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf\"] [line \"158\"] [id \"941130\"] [rev \"2\"] [msg \"XSS Filter - Category 3: Attribute Vector\"] [data \"Matched Data: /xhtml found within ARGS:acceptHeader: text/html,application/xhtml xml,application/xml;q=0.9,*/*;q=0.8\"] [severity \"CRITICAL\"] [ver \"OWASP_CRS/3.0.0\"] [maturity \"1\"] [accuracy \"8\"] [tag \"application-multi\"] [tag \"language-multi\"] [tag \"platform-multi\"] [tag \"attack-xss\"] [tag \"OWASP_CRS/WEB_ATTACK/XSS\"] [tag \"WASCTC/WASC-8\"] [tag \"WASCTC/WASC-22\"] [tag \"OWASP_TOP_10/A3\"] [tag \"OWASP_AppSensor/IE1\"] [tag \"CAPEC-242\"]",
          "type" => "section_h",
    "@timestamp" => 2020-06-26T13:41:33.496Z,
          "path" => "/var/log/sectionh.log",
      "@version" => "1"
}

As you can see, the "after" field has the information I'm interested in. This is what I'd like to extract from the "after" field and store in Elasticsearch:

"file" => "/usr/share/modsecurity-crs/rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf"
"line" => 158
"id" => "941130"
"rev" => "2"
"msg" => "XSS Filter - Category 3: Attribute Vector"
"data" => "Matched Data: /xhtml found within ARGS:acceptHeader: text/html,application/xhtml xml,application/xml;q=0.9,*/*;q=0.8"
"severity" => "CRITICAL"
"ver" => "OWASP_CRS/3.0.0"
"maturity" => "1"
"accuracy" => "8"
"tag" => "application-multi"
"tag" => "language-multi"
"tag" => "platform-multi"
"tag" => "attack-xss"
"tag" => "OWASP_CRS/WEB_ATTACK/XSS"
"tag" => "WASCTC/WASC-8"
"tag" => "WASCTC/WASC-22"
"tag" => "OWASP_TOP_10/A3"
"tag" => "OWASP_AppSensor/IE1"
"tag" => "CAPEC-242"

It doesn't look as though it would be that hard, does it? Yet for some reason, any filter plugin I apply to the "after" field doesn't 'take', and the output doesn't change. I have tried both the 'kv' plugin and the 'gsub' option of 'mutate', and neither modifies the "after" field the way my reading of the documentation led me to believe it would.

Here is the full configuration file for the pipeline:

input {

  file {
    path => "/var/log/sectionh.log"
    type => "section_h"

    }
}

filter {
    dissect {
        mapping => {
            "message" => "%{?idea}. %{?before}. %{after}"
        }
    }
    mutate {
        remove_field => [ "message" ]
        gsub => [
            "after", "\[\"", "" ]
    }
}

filter {
   kv {
      source => "after"
      allow_duplicate_values => true    
      include_brackets => false
   }
}

output {

  # turn this off when ready to run in a 
  # real prod environment and get rid of the 
  # "-v" flag when starting logstash
  stdout { codec => rubydebug }
#  elasticsearch {
#    index => "modsecurity-%{+YYYY.MM.dd}"
#    hosts => ["http://0.0.0.0:9200"]
#  }
#stdout {}
}
  
Despite everything, the plugins do not operate on the "after" field. They don't appear to be operating on anything. Is there some problem with the nesting or the placement? I have been working on this for a week.

Thanking you all in anticipation.

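I do not think the mutate+gsub has any effect, because the pattern does not match: it looks for a [ immediately followed by a ", and that sequence never occurs in the sample [after] value. You can parse the [after] field using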

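    kv {
        source => "after"
        # keys and values are separated by a space, not the default "="
        value_split => " "
        # split pairs on "] [", on the leading "[" and on the trailing "]"
        field_split_pattern => "\] \[|^\[|\]$"
        # keep duplicate key/value pairs (this is the default)
        allow_duplicate_values => true
    }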

which produces

       "tag" => [
    [0] "application-multi",
    [1] "language-multi",
    [2] "platform-multi",
    [3] "attack-xss",
    [4] "OWASP_CRS/WEB_ATTACK/XSS",
    [5] "WASCTC/WASC-8",
    [6] "WASCTC/WASC-22",
    [7] "OWASP_TOP_10/A3",
    [8] "OWASP_AppSensor/IE1",
    [9] "CAPEC-242"
],
      "file" => "/usr/share/modsecurity-crs/rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf",
        "id" => "941130",
       "rev" => "2",
      "line" => "158",
       "msg" => "XSS Filter - Category 3: Attribute Vector",
  "severity" => "CRITICAL",
       "ver" => "OWASP_CRS/3.0.0",
  "maturity" => "1",
      "data" => "Matched Data: /xhtml found within ARGS:acceptHeader: text/html,application/xhtml xml,application/xml;q=0.9,*/*;q=0.8",
  "accuracy" => "8",

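Putting the pieces together, a minimal sketch of the whole filter section might look like the following. It reuses the dissect mapping from the question and the kv settings above. The remove_field and the mutate/convert steps are my own additions, on the assumption that you no longer need [after] and [message] once they are parsed and that you want [line] stored as a number, as in your desired output.

```
filter {
    # Split on the first two ". " delimiters; only the remainder is kept, in [after].
    dissect {
        mapping => {
            "message" => "%{?idea}. %{?before}. %{after}"
        }
    }

    # Turn each [key "value"] group in [after] into its own field.
    kv {
        source => "after"
        value_split => " "
        field_split_pattern => "\] \[|^\[|\]$"
        allow_duplicate_values => true
        # Assumption: the raw fields are not needed once parsing has succeeded.
        remove_field => [ "after", "message" ]
    }

    # Assumption: [line] should be an integer rather than the string "158".
    mutate {
        convert => { "line" => "integer" }
    }
}
```

As far as I know, remove_field on kv is one of the common filter options and is only applied when the filter succeeds, so if kv finds no pairs the event should keep [after] and [message] for debugging.
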
Thank you, Badger. That worked perfectly.