Grok regex doesn't match

Hi everybody,

I'm asking for your help because I cannot correctly parse one of my logs. My original log includes regex patterns and I would like to delete them. I wrote a regex which works fine on https://grokdebug.herokuapp.com/, but when I run it in production I get a grok parse failure.

Below is an example of my original log message:

[wabaudit] action="list" type="approvals" user="k789db" client_ip="10.10.10.10" infos=""\n

... and below is my filter config file:

    filter {
        grok {
            match => { "message" => "\[wabaudit\] action=(\\\")%{GREEDYDATA:action}(\\\") type=(\\\")%{GREEDYDATA:type}(\\\")  user=(\\\")%{GREEDYDATA:user}(\\\") client_ip=(\\\")%{GREEDYDATA:client_ip}(\\\") infos=(\\\")%{GREEDYDATA:infos}(\\\"\\n)" }
            tag_on_failure => ["_grokparsefailure_F090-WAB.conf"]
            add_field => { "ES_systemtype" => "WAB" }
        }
    }

Thank you for your help.

It will be a lot easier to write the regexp if you use single quotes. And you do not need GREEDYDATA.

match => { "message" => '\[wabaudit\] action="%{DATA:action}" type="%{DATA:type}" user="%{DATA:user}" client_ip="%{DATA:client_ip}" infos="%{DATA:infos}"' }

Personally I would switch all the patterns to (?<name>[^"]+). You can also anchor the pattern itself with ^:

match => { "message" => '^\[wabaudit\] action="(?<action>[^"]+)" type="(?<type>[^"]+)" user="(?<user>[^"]+)" client_ip="(?<client_ip>[^"]+)" infos="(?<infos>[^"]+)"' }
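
To see why a negated character class is a safer choice than a greedy wildcard for quoted values, here is a small sketch in Python rather than grok. It assumes GREEDYDATA behaves like `.*` (as in the stock grok patterns), and it uses Python's `(?P<name>...)` spelling for named groups where grok's Oniguruma engine uses `(?<name>...)`. The shortened sample line is only for illustration.

```python
import re

line = '[wabaudit] action="list" type="approvals" user="k789db"'

# Greedy wildcard: with nothing after it but a quote, it runs to the LAST quote on the line.
greedy = re.search(r'action="(?P<action>.*)"', line)

# Negated class: cannot cross a double quote, so it stops at the first closing quote.
bounded = re.search(r'action="(?P<action>[^"]+)"', line)

print(greedy.group("action"))
print(bounded.group("action"))
```

In a full pattern, the literal text between fields usually forces the greedy version to backtrack to the right boundaries, but the negated class gets there directly and fails fast on lines that do not fit.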

Hello Badger,

I made a mistake in my original message. The message to parse is:

[wabaudit] action=\"list\" type=\"approvals\"  user=\"k789db\" client_ip=\"10.10.10.10\" infos=\"\"\n

I would like to have these fields in my rubydebug output:

"message" => "[wabaudit] action="list" type="approvals" user=" k789db" client_ip="10.10.10.10" infos=""\n"
"action" => "list"
"type" => "approvals"
"user" => "k789db"
"client_ip" => "10.10.10.10"
"infos" => ""

The grok debugger works fine.

Regards.

Try

    grok { match => { "message" => '^\[wabaudit\] action="(?<action>[^"]*)" type="(?<type>[^"]*)"  user="(?<user>[^"]*)" client_ip="(?<client_ip>[^"]*)" infos="%{GREEDYDATA:infos}' } }
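
The change from `+` to `*` in this pattern matters because the sample line ends with an empty value, `infos=""`. A minimal Python sketch of the difference (Python named-group syntax, shortened sample line, both assumptions made for illustration only):

```python
import re

line = '[wabaudit] client_ip="10.10.10.10" infos=""'

# + requires at least one character between the quotes, so an empty value fails.
one_or_more = re.search(r'infos="(?P<infos>[^"]+)"', line)

# * also accepts zero characters, so the empty value is captured as "".
zero_or_more = re.search(r'infos="(?P<infos>[^"]*)"', line)

print(one_or_more)
print(repr(zero_or_more.group("infos")))
```

A single field that fails to match is enough to make the whole grok expression fail, which would surface as the failure tag on the event.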

It doesn't work :confused:

Wait, do you actually have backslashes in the log message? I assumed that was an artifact of the way you were outputting the message.

To parse a line that looks like this

[wabaudit] action=\"list\" type=\"approvals\"  user=\"k789db\" client_ip=\"10.10.10.10\" infos=\"\"

You can use

grok { match => { "message" => '^\[wabaudit\] action=\\"(?<action>[^"]*)\\" type=\\"(?<type>[^"]*)\\"  user=\\"(?<user>[^"]*)\\" client_ip=\\"(?<client_ip>[^"]*)\\" infos=\\"%{GREEDYDATA:infos}' } }

which will get you

     "infos" => "\\\"\n",
      "user" => "k789db",
    "action" => "list",
 "client_ip" => "10.10.10.10",
   "message" => "[wabaudit] action=\\\"list\\\" type=\\\"approvals\\\"  user=\\\"k789db\\\" client_ip=\\\"10.10.10.10\\\" infos=\\\"\\\"\n",
      "type" => "approvals"

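As a rough illustration of what `\\"` does in that pattern, the Python sketch below matches a line that really contains backslash characters. The shortened line and the Python `(?P<name>...)` group syntax are assumptions for the example; grok itself uses `(?<name>...)`.

```python
import re

# Raw string: the data really contains a backslash before each quote.
escaped_line = r'[wabaudit] action=\"list\" type=\"approvals\"'
plain_line = '[wabaudit] action="list" type="approvals"'

# In the regex, \\ matches one literal backslash and " matches the quote after it.
pattern = re.compile(r'action=\\"(?P<action>[^"]*)\\" type=\\"(?P<type>[^"]*)\\"')

m = pattern.search(escaped_line)
print(m.groupdict())

# The same pattern finds nothing when the data has no backslashes.
print(pattern.search(plain_line))
```

So a pattern written for escaped quotes only helps if the backslashes are truly part of the event, not just part of how the event is displayed.
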
I don't get this result with your grok pattern.
Indeed, I do have backslashes in the log message.
I ran a tcpdump on the Logstash host's Ethernet interface, and the backslashes are not there. I think Logstash inserts them when the log passes through the udp input plugin. Now I would like to remove these backslashes so I can extract the information.

Below is the complete config file:

input {
   udp {
      port => 2514
      codec => plain { charset => "UTF-8" }
   }
}

filter {
   grok {
      match => { "message" => "\[wabaudit\] action=(\\\")%{GREEDYDATA:action}(\\\") type=(\\\")%{GREEDYDATA:type}(\\\")  user=(\\\")%{GREEDYDATA:user}(\\\") client_ip=(\\\")%{GREEDYDATA:client_ip}(\\\") infos=(\\\")%{GREEDYDATA:infos}(\\\"\\n)" }
      match => { "message" => '^\[wabaudit\] action=\\"(?<action>[^"]*)\\" type=\\"(?<type>[^"]*)\\"  user=\\"(?<user>[^"]*)\\" client_ip=\\"(?<client_ip>[^"]*)\\" infos=\\"%{GREEDYDATA:infos}' }
      tag_on_failure => ["_grokparsefailure_F090-WAB.conf"]
      add_field => { "ES_systemtype" => "WAB" }
   }
}

output {
   file {
      codec => rubydebug
      path => "/var/log/logstash/1.log"
   }
}

That suggests you do not have backslashes in your message; the rubydebug codec adds them when it prints the event. If you run this configuration

input { generator { count => 1 message => '[wabaudit] action=\"list\"' } }
input { generator { count => 1 message => '[wabaudit] action="list"' } }

output { stdout { codec => rubydebug { metadata => false } } }

you will get

   "message" => "[wabaudit] action=\\\"list\\\""
   "message" => "[wabaudit] action=\"list\""

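The same display effect can be reproduced outside Logstash. The sketch below uses Python's `json.dumps` as a stand-in for rubydebug (an assumption: it is not the codec itself, it only applies the same kind of escaping when printing a double-quoted string).

```python
import json

with_backslashes = r'[wabaudit] action=\"list\"'   # backslashes are part of the data
without_backslashes = '[wabaudit] action="list"'   # only quotes in the data

# Printing as a quoted string escapes every quote and every backslash.
print(json.dumps(with_backslashes))
print(json.dumps(without_backslashes))

# The second string contains no backslash at all, despite how it is displayed.
print("\\" in without_backslashes)
```

One backslash before each quote in the printed form therefore means the data holds a bare quote; three backslashes mean the data holds a real backslash followed by a quote.
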
Hello Badger,

Thank you for your help! I was convinced that the backslashes were added by the input plugin.
I rewrote my grok pattern like this:

match => { "message" => '\[wabaudit\] action="(?<action>[^"]*)" type="(?<type>[^"]*)"  user="(?<user>[^"]*)" client_ip="(?<client_ip>[^"]*)" infos="(?<infos>[^"]*)"' }
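
For anyone who wants to check a pattern like this outside Logstash, here is a hedged Python sketch of the same expression. It assumes the raw event has plain quotes, two spaces before `user=` (as the pattern expects), and a trailing newline; named groups are written in Python's `(?P<name>...)` form.

```python
import re

# Sample event as assumed above: plain quotes, two spaces before user=, trailing newline.
line = '[wabaudit] action="list" type="approvals"  user="k789db" client_ip="10.10.10.10" infos=""\n'

pattern = re.compile(
    r'\[wabaudit\] action="(?P<action>[^"]*)" type="(?P<type>[^"]*)"'
    r'  user="(?P<user>[^"]*)" client_ip="(?P<client_ip>[^"]*)" infos="(?P<infos>[^"]*)"'
)

# groupdict() returns one entry per named group, similar to the fields grok would add.
fields = pattern.search(line).groupdict()
print(fields)
```

The double space before `user=` is literal in the pattern, so an event with a single space there would not match.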

It works perfectly !

Have a good day.
