Grok pattern not working

Hi! I want to parse an HAProxy log sent by an rsyslog service. I have one grok pattern that parses the original rsyslog message (placing the service's message in a syslog_message field) and another that parses syslog_message. The first pattern works like a charm, but the second one gives me the "_grokparsefailure" tag.
I checked my patterns with these two well-known sites, and they work fine in both:
http://grokconstructor.appspot.com/do/match
https://grokdebug.herokuapp.com/
Here is the output of Logstash:

{
                "type" => "syslog",
    "syslog_timestamp" => "Jun 26 11:29:24",
               "bytes" => "13564",
             "message" => "<150>Jun 26 11:29:24 HAPROXYSERVER haproxy[13564]: 1.1.1.1:44731 [26/Jun/2020:11:29:24.705] FRONTEND~ BACKEND/BACKENNODE 51/0/0/31/82 200 3898 - - ---- 1/1/0/1/0 0/0 {|SERVER.DOMAIN.COM|} \"GET / HTTP/1.1\"",
            "@version" => "1",
     "syslog_hostname" => "HAPROXYSERVER",
                "host" => "172.31.28.31",
          "syslog_pri" => "150",
      "syslog_message" => "1.1.1.1:44731 [26/Jun/2020:11:29:24.705] FRONTEND~ BACKEND/BACKENNODE 51/0/0/31/82 200 3898 - - ---- 1/1/0/1/0 0/0 {|SERVER.DOMAIN.COM|} \"GET / HTTP/1.1\"",
                "port" => 26846,
      "syslog_service" => "haproxy",
                "tags" => [
        [0] "haproxy",
        [1] "_grokparsefailure"
    ],
          "@timestamp" => 2020-06-26T14:29:14.202Z
}

Here is the config:

input {
  tcp {
    port => 5142
    type => syslog
    tags => ["haproxy"]
  }
}

filter {
  if "haproxy" in [tags] {
    grok {
      match => { "message" => '<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{WORD:syslog_service}\[%{NUMBER:bytes}\]: %{GREEDYDATA:syslog_message}' }
    }
  }
  grok {
       match => { "syslog_message" => '%{IPV4:clientip}:%{POSINT:clientport} \[%{HAPROXYDATE:accept_date}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{NUMBER:time_request;int}/%{NUMBER:time_queue;int}/%{NUMBER:time_backend_connect;int}/%{NUMBER:time_backend_response;int}/%{NUMBER:time_duration;int} %{NUMBER:http_status_code;int} %{NOTSPACE:bytes_read} %{DATA:captured_request_cookie} %{DATA:captured_response_cookie} %{NOTSPACE:termination_state} %{NUMBER:actconn}/%{NUMBER:feconn}/%{NUMBER:beconn}/%{NUMBER:srvconn}/%{NOTSPACE:retries} %{NUMBER:srv_queue}/%{NUMBER:backend_queue} \{%{GREEDYDATA:Header}\} \\"(<BADREQ>|(%{WORD:http_verb} (%{URIPROTO:http_proto}://)?(?:%{USER:http_user}(?::[^@]*)?@)?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?\\"' }
  }
}


output {
    stdout { codec => rubydebug }
}

Why is my second pattern not working on logstash?

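The \\" should be \".

In the regex, \\" matches a literal backslash followed by a double quote, but the syslog_message field only contains the quote itself. The backslashes in the rubydebug output are just how the string is displayed.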

I was surprised that the ;int worked. It would be more usual to use :int.
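
To illustrate the escaping point, here is a minimal sketch that uses Python's re module as a stand-in for the Oniguruma engine behind grok. That substitution is an assumption, although both engines should treat these particular escapes the same way. The sample string is shortened from the log line above, and the variable names are made up for the example.

```python
import re

# Tail of the syslog_message field as it is actually stored: plain double
# quotes, no backslashes (rubydebug only displays them as \").
field_tail = '{|SERVER.DOMAIN.COM|} "GET / HTTP/1.1"'

# Regex text as it would be written inside the single-quoted grok pattern.
broken = r'\} \\"GET'   # \\" means: a literal backslash, then a quote
fixed = r'\} \"GET'     # \" means: just a quote

broken_matches = re.search(broken, field_tail) is not None
fixed_matches = re.search(fixed, field_tail) is not None

print("broken pattern matches:", broken_matches)
print("fixed pattern matches:", fixed_matches)
```

Only the second form should match, because the field never contains a backslash.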

I made the change, but the result is the same:

       match => { "syslog_message" => '%{IPV4:clientip}:%{POSINT:clientport} \[%{HAPROXYDATE:accept_date}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{NUMBER:time_request:int}/%{NUMBER:time_queue:int}/%{NUMBER:time_backend_connect:int}/%{NUMBER:time_backend_response:int}/%{NUMBER:time_duration:int} %{NUMBER:http_status_code:int} %{NOTSPACE:bytes_read:int} %{DATA:captured_request_cookie} %{DATA:captured_response_cookie} %{NOTSPACE:termination_state} %{NUMBER:actconn}/%{NUMBER:feconn}/%{NUMBER:beconn}/%{NUMBER:srvconn}/%{NOTSPACE:retries} %{NUMBER:srv_queue}/%{NUMBER:backend_queue} \{%{GREEDYDATA:Header}\} \".(<BADREQ>|(%{WORD:http_verb} (%{URIPROTO:http_proto}://)?(?:%{USER:http_user}(?::[^@]*)?@)?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?\".' }

Also, that pattern does not work in either of the sites I mentioned before.
I changed ;int to :int too, just in case.

Not sure what to say. This works for me...

input { generator { count => 1 lines => [ '1.1.1.1:44731 [26/Jun/2020:11:29:24.705] FRONTEND~ BACKEND/BACKENNODE 51/0/0/31/82 200 3898 - - ---- 1/1/0/1/0 0/0 {|SERVER.DOMAIN.COM|} "GET / HTTP/1.1"' ] } }
filter {
    mutate { add_field => { "syslog_message" => "%{message}" } }
    grok {
        match => { "syslog_message" => '%{IPV4:clientip}:%{POSINT:clientport} \[%{HAPROXYDATE:accept_date}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{NUMBER:time_request;int}/%{NUMBER:time_queue;int}/%{NUMBER:time_backend_connect;int}/%{NUMBER:time_backend_response;int}/%{NUMBER:time_duration;int} %{NUMBER:http_status_code;int} %{NOTSPACE:bytes_read} %{DATA:captured_request_cookie} %{DATA:captured_response_cookie} %{NOTSPACE:termination_state} %{NUMBER:actconn}/%{NUMBER:feconn}/%{NUMBER:beconn}/%{NUMBER:srvconn}/%{NOTSPACE:retries} %{NUMBER:srv_queue}/%{NUMBER:backend_queue} \{%{GREEDYDATA:Header}\} \"(<BADREQ>|(%{WORD:http_verb} (%{URIPROTO:http_proto}://)?(?:%{USER:http_user}(?::[^@]*)?@)?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?\"' }
    }
}
output { stdout { codec => rubydebug } }

That produces

          "syslog_message" => "1.1.1.1:44731 [26/Jun/2020:11:29:24.705] FRONTEND~ BACKEND/BACKENNODE 51/0/0/31/82 200 3898 - - ---- 1/1/0/1/0 0/0 {|SERVER.DOMAIN.COM|} \"GET / HTTP/1.1\"",
             "server_name" => "BACKENNODE",
   "time_backend_response" => "31",
            "backend_name" => "BACKEND",
                 "actconn" => "1",
            "http_version" => "1.1",

etc.

I managed to make it work. When you pointed out the error at \\", I assumed I had to replace it with \". including the period (I was copying the final punctuation of your sentence).
I ran your last suggestion on a test file with stdout as the output and it worked, so I compared your pattern against mine and saw that you were using \" rather than \". as I had. I hope I am making myself clear.
Now it is working fine.
One last question: is there a link where I can find information about how to handle symbols and special characters in Logstash grok? I looked for this when I was creating my configuration files, and the only place where it was explained in any way was this forum.
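
The failure caused by the stray period can be reproduced in isolation. The following is a small sketch, again using Python's re module as an assumed stand-in for grok's Oniguruma engine. It shows that a "." after the escaped quote demands one more character, so the pattern cannot match when the closing quote is the last character of the field. The variable names are invented for the example.

```python
import re

# End of the log line: the closing quote is the very last character.
line_end = 'HTTP/1.1"'

with_dot = r'HTTP/1\.1\".'    # "." requires one more character after the quote
without_dot = r'HTTP/1\.1\"'  # stops at the quote

dot_matches = re.search(with_dot, line_end) is not None
plain_matches = re.search(without_dot, line_end) is not None

# The dotted form only matches if something follows the closing quote.
dot_matches_with_trailing = re.search(with_dot, line_end + ' ') is not None

print("with trailing dot:", dot_matches)
print("without trailing dot:", plain_matches)
print("with trailing dot, extra character present:", dot_matches_with_trailing)
```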

The grok documentation provides some information, and links to the documentation for the Oniguruma library that grok is built upon.
