ndg (Nahuel Facundo Delle Grazie) | June 26, 2020, 2:37pm | #1
Hi! I want to parse an HAProxy log sent by an rsyslog service. I have one grok pattern that parses the original rsyslog message (and puts the service's message into a syslog_message field) and another that parses syslog_message. The first pattern works like a charm, but the second one gives me the "_grokparsefailure" tag.
I checked my patterns with these two well-known websites, and they work fine in both:
http://grokconstructor.appspot.com/do/match  
https://grokdebug.herokuapp.com/  
Here is the Logstash output:
{
                "type" => "syslog",
    "syslog_timestamp" => "Jun 26 11:29:24",
               "bytes" => "13564",
             "message" => "<150>Jun 26 11:29:24 HAPROXYSERVER haproxy[13564]: 1.1.1.1:44731 [26/Jun/2020:11:29:24.705] FRONTEND~ BACKEND/BACKENNODE 51/0/0/31/82 200 3898 - - ---- 1/1/0/1/0 0/0 {|SERVER.DOMAIN.COM|} \"GET / HTTP/1.1\"",
            "@version" => "1",
     "syslog_hostname" => "HAPROXYSERVER",
                "host" => "172.31.28.31",
          "syslog_pri" => "150",
      "syslog_message" => "1.1.1.1:44731 [26/Jun/2020:11:29:24.705] FRONTEND~ BACKEND/BACKENNODE 51/0/0/31/82 200 3898 - - ---- 1/1/0/1/0 0/0 {|SERVER.DOMAIN.COM|} \"GET / HTTP/1.1\"",
                "port" => 26846,
      "syslog_service" => "haproxy",
                "tags" => [
        [0] "haproxy",
        [1] "_grokparsefailure"
    ],
          "@timestamp" => 2020-06-26T14:29:14.202Z
}
 
Here is the config:
input {
  tcp {
    port => 5142
    type => syslog
    tags => ["haproxy"]
  }
}
filter {
  if "haproxy" in [tags] {
    grok {
      match => { "message" => '<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{WORD:syslog_service}\[%{NUMBER:bytes}\]: %{GREEDYDATA:syslog_message}' }
    }
  }
  grok {
       match => { "syslog_message" => '%{IPV4:clientip}:%{POSINT:clientport} \[%{HAPROXYDATE:accept_date}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{NUMBER:time_request;int}/%{NUMBER:time_queue;int}/%{NUMBER:time_backend_connect;int}/%{NUMBER:time_backend_response;int}/%{NUMBER:time_duration;int} %{NUMBER:http_status_code;int} %{NOTSPACE:bytes_read} %{DATA:captured_request_cookie} %{DATA:captured_response_cookie} %{NOTSPACE:termination_state} %{NUMBER:actconn}/%{NUMBER:feconn}/%{NUMBER:beconn}/%{NUMBER:srvconn}/%{NOTSPACE:retries} %{NUMBER:srv_queue}/%{NUMBER:backend_queue} \{%{GREEDYDATA:Header}\} \\"(<BADREQ>|(%{WORD:http_verb} (%{URIPROTO:http_proto}://)?(?:%{USER:http_user}(?::[^@]*)?@)?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?\\"' }
  }
}
output {
    stdout { codec => rubydebug }
}
 
Why is my second pattern not working in Logstash?

Badger | June 26, 2020, 3:00pm | #2
The \\" should be \".
I was surprised that the ;int worked. It would be more normal to use :int
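
To see why the doubled backslash breaks the match, here is a minimal sketch in Python. It assumes that Python's re module treats these two escapes the same way as the Oniguruma engine behind grok, which should hold for a backslash-escaped backslash and a backslash-escaped quote. The shortened sample line and the variable names are illustrative only and do not come from the thread.

```python
import re

# Tail of the sample HAProxy line; the real log contains a bare double quote.
line = '{|SERVER.DOMAIN.COM|} "GET / HTTP/1.1"'

# \\" in a regex means: a literal backslash followed by a double quote.
doubled = re.compile(r'\} \\"(\w+) ')
# \" in a regex means: just a double quote.
single = re.compile(r'\} \"(\w+) ')

print(doubled.search(line))          # None, because the log line has no backslash
print(single.search(line).group(1))  # GET
```

The backslashes shown before the quotes in the rubydebug output are only how the console escapes the string for display; they are not part of the field value.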

ndg (Nahuel Facundo Delle Grazie) | June 26, 2020, 3:34pm | #3
I made the change, but the result is the same:
       match => { "syslog_message" => '%{IPV4:clientip}:%{POSINT:clientport} \[%{HAPROXYDATE:accept_date}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{NUMBER:time_request:int}/%{NUMBER:time_queue:int}/%{NUMBER:time_backend_connect:int}/%{NUMBER:time_backend_response:int}/%{NUMBER:time_duration:int} %{NUMBER:http_status_code:int} %{NOTSPACE:bytes_read:int} %{DATA:captured_request_cookie} %{DATA:captured_response_cookie} %{NOTSPACE:termination_state} %{NUMBER:actconn}/%{NUMBER:feconn}/%{NUMBER:beconn}/%{NUMBER:srvconn}/%{NOTSPACE:retries} %{NUMBER:srv_queue}/%{NUMBER:backend_queue} \{%{GREEDYDATA:Header}\} \".(<BADREQ>|(%{WORD:http_verb} (%{URIPROTO:http_proto}://)?(?:%{USER:http_user}(?::[^@]*)?@)?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?\".' }
 
Also, that pattern does not work on either of the sites I mentioned before.
I also changed the ;int to :int, just in case.

Badger | June 26, 2020, 3:39pm | #4
Not sure what to say. This works for me...
input { generator { count => 1 lines => [ '1.1.1.1:44731 [26/Jun/2020:11:29:24.705] FRONTEND~ BACKEND/BACKENNODE 51/0/0/31/82 200 3898 - - ---- 1/1/0/1/0 0/0 {|SERVER.DOMAIN.COM|} "GET / HTTP/1.1"' ] } }
filter {
    mutate { add_field => { "syslog_message" => "%{message}" } }
    grok {
        match => { "syslog_message" => '%{IPV4:clientip}:%{POSINT:clientport} \[%{HAPROXYDATE:accept_date}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{NUMBER:time_request;int}/%{NUMBER:time_queue;int}/%{NUMBER:time_backend_connect;int}/%{NUMBER:time_backend_response;int}/%{NUMBER:time_duration;int} %{NUMBER:http_status_code;int} %{NOTSPACE:bytes_read} %{DATA:captured_request_cookie} %{DATA:captured_response_cookie} %{NOTSPACE:termination_state} %{NUMBER:actconn}/%{NUMBER:feconn}/%{NUMBER:beconn}/%{NUMBER:srvconn}/%{NOTSPACE:retries} %{NUMBER:srv_queue}/%{NUMBER:backend_queue} \{%{GREEDYDATA:Header}\} \"(<BADREQ>|(%{WORD:http_verb} (%{URIPROTO:http_proto}://)?(?:%{USER:http_user}(?::[^@]*)?@)?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?\"' }
    }
}
 
That produces
          "syslog_message" => "1.1.1.1:44731 [26/Jun/2020:11:29:24.705] FRONTEND~ BACKEND/BACKENNODE 51/0/0/31/82 200 3898 - - ---- 1/1/0/1/0 0/0 {|SERVER.DOMAIN.COM|} \"GET / HTTP/1.1\"",
             "server_name" => "BACKENNODE",
   "time_backend_response" => "31",
            "backend_name" => "BACKEND",
                 "actconn" => "1",
            "http_version" => "1.1",
 
etc.

ndg (Nahuel Facundo Delle Grazie) | June 26, 2020, 6:43pm | #5
I managed to make it work. When you pointed out the error at \", I assumed that I had to correct it with \". (I was including the final period of your sentence in the pattern.)
I ran your last suggestion on a test file with stdout as the output, and it worked, so I compared your pattern against mine and saw that you were using \" instead of \". as I did. I hope I am making myself clear.
Now it is working fine.
One last question: is there a link where I can find information about how to deal with symbols and such in Logstash grok patterns? I looked for this when I was creating my configuration files, and the only place where it was explained in any way was this forum.
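
The trailing-period mistake described above can be reproduced with a small Python sketch. The two regexes are simplified stand-ins for the end of the grok pattern, not the full pattern, and Python's re module is assumed to behave like Oniguruma for these constructs. An unescaped period matches any single character, so a period after the closing quote demands one more character that the log line does not have.

```python
import re

# End of the sample log line: the closing quote is the last character.
line = '0/0 {|SERVER.DOMAIN.COM|} "GET / HTTP/1.1"'

# Simplified stand-ins for the end of the grok pattern (hypothetical, for illustration).
with_period = re.compile(r'\"(\w+) (\S+) HTTP/([\d.]+)\".')
without_period = re.compile(r'\"(\w+) (\S+) HTTP/([\d.]+)\"')

print(with_period.search(line))              # None: nothing follows the closing quote
print(without_period.search(line).groups())  # ('GET', '/', '1.1')
```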

Badger | June 26, 2020, 7:21pm | #6
The grok documentation provides some information, and links to the documentation for the Oniguruma library that grok is built upon.