Grok parse failures for ssl_request_log (but none in debugger)


(Adam Stallard) #1

I have a log file from Apache's ssl_request_log, which uses a slightly different format from the other Apache logs: several fields appear in a different order. So I copied the pattern behind %{COMBINEDAPACHELOG} and made a few changes to get this:

match => { "message" => "\[%{HTTPDATE:timestamp}\] %{IPORHOST:clientip} %{NOTSPACE:tls_version} %{NOTSPACE:crypto_algorithm} \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response}" }

When I paste the pattern (without the enclosing quotes) into the online grok debugger, everything parses out as expected. Yet Logstash fails to parse the same lines, and every entry from this source ends up in Elasticsearch with a _grokparsefailure and my own no_apache_match tag.

Here is an example line:

[06/Oct/2017:14:54:04 -0600] 172.17.0.47 TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 "GET /sadf/sdf.jpg?134 HTTP/1.1" 200
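As a sanity check outside of both Logstash and the online debugger, the grok pattern can be approximated with plain regexes. The sub-patterns below are simplified, hypothetical stand-ins for the real grok definitions (e.g. IPORHOST is reduced to `\S+`), but they are close enough to exercise the structure against the sample line:

```python
import re

# Simplified stand-ins for the grok sub-patterns used in the match
# (the real HTTPDATE, IPORHOST, NUMBER, etc. are more elaborate).
pattern = re.compile(
    r'\[(?P<timestamp>\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\] '
    r'(?P<clientip>\S+) (?P<tls_version>\S+) (?P<crypto_algorithm>\S+) '
    r'"(?:(?P<verb>\w+) (?P<request>\S+)(?: HTTP/(?P<httpversion>\d+(?:\.\d+)?))?'
    r'|(?P<rawrequest>.*?))" (?P<response>\d+)'
)

line = ('[06/Oct/2017:14:54:04 -0600] 172.17.0.47 TLSv1.2 '
        'ECDHE-RSA-AES128-GCM-SHA256 "GET /sadf/sdf.jpg?134 HTTP/1.1" 200')

m = pattern.match(line)
print(m.group("verb"), m.group("request"), m.group("response"))
# -> GET /sadf/sdf.jpg?134 200
```

The sample line matches cleanly here too, which is consistent with what the debugger reports.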

I'm fairly confident in my pipeline configuration, which is set up as a chain of fallbacks (there is probably some redundancy in here, but I don't see any issues):

filter {

  if "apache" in [tags] {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
      remove_field => "message"
      add_tag => ["combinedapachelog"]
    }
    if "combinedapachelog" not in [tags] {
      grok {
        match => { "message" => "%{COMMONAPACHELOG}" }
        remove_field => "message"
        add_tag => ["commonapachelog"]
      }
      if "commonapachelog" not in [tags] {
        grok {
          match => { "message" => "%{HTTPD_ERRORLOG}" }
          remove_field => "message"
          add_tag => ["httpd_errorlog"]
        }
        if "httpd_errorlog" not in [tags] {
          grok {
            match => { "message" => "\[%{HTTPDATE:timestamp}\] %{IPORHOST:clientip} %{NOTSPACE:tls_version} %{NOTSPACE:crypto_algorithm} \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response}" }
            remove_field => "message"
            add_tag => ["ssl_request_log"]
            tag_on_failure => ["no_apache_match"]
          }
        }
      }
    }
    date {
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
      locale => "en"
      remove_field => ["timestamp"]
    }
    if "no_apache_match" not in [tags] {
      geoip {
        source => "clientip"
        target => "geoip"
        add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
        add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
      }
    }
    mutate {
      convert => {
          "[geoip][coordinates]" => "float"
      }
    }
    useragent {
      source => "agent"
      target => "useragent"
    }
  }

  else if "syslog" in [tags] {
   
    grok {
      match => { "message" => "^<%{POSINT:syslog_pri}>(?:%{TIMESTAMP_ISO8601:syslog_timestamp}|%{SYSLOGTIMESTAMP:syslog_timestamp}) %{SYSLOGHOST:syslog_hostname} %{DATA:program}(?:\[%{NONNEGINT:pid}\]?)?:? %{GREEDYDATA:syslog_message}" } 
    }
    syslog_pri { 
      syslog_pri_field_name => "syslog_pri"
    }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
  
  else if "fiesta" in [tags] {
 
    grok {
      match => { "message" => "%{GREEDYDATA:data}" }
    }
    mutate {
        add_field => { "[location][lat]" => "%{lat}" }
        add_field => { "[location][lon]" => "%{lon}" }
    }
    mutate {
        convert => {
            "[location][lat]" => "float"
            "[location][lon]" => "float"
        }
    }
    date {
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  
  }
  
}

This is my only filter, from a single config file. All of my log entries from the ssl_request_log source end up tagged as no_apache_match, meaning they fail every grok match attempt in my Apache chain.

I'm not really sure how to go about debugging this further since the online debugger works.

UPDATE:

So something is very strange. I set up a test like so:

# contents in conf.test
input {
  stdin { }
}
filter {
  grok {
    match => { "message" => "\[%{HTTPDATE:timestamp}\] %{IPORHOST:clientip} %{NOTSPACE:tls_version} %{NOTSPACE:crypto_algorithm} \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response}" }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

# contents in test.dat
[06/Oct/2017:14:54:04 -0600] 172.17.0.47 TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 "GET /sadf/sdf.jpg?134 HTTP/1.1" 200

# actual start command
/usr/share/logstash/bin/logstash --debug -f /etc/logstash/conf.d/conf.test < /etc/logstash/conf.d/test.dat

and it works! I get the document printed out to stdout with all fields parsed correctly. This is strange.


(Adam Stallard) #2

OK I figured out my problem! It was not my configuration!

In my haste to debug, I copied test lines over from the ssl_request_log incorrectly. In my real data, the last field was coming through as a -, not a response code like 200.

Since my pattern used %{NUMBER:response}, which can't match a bare -, I changed it to %{NOTSPACE:response}. Now everything is working!
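The root cause is easy to see in isolation: grok's NUMBER pattern only accepts numeric text, while NOTSPACE accepts any run of non-whitespace, including the - that Apache logs when a value is absent. A minimal sketch (using simplified regexes, not the exact grok definitions):

```python
import re

# Simplified equivalents of the grok patterns (the real BASE10NUM
# behind NUMBER is more involved, but agrees on this case).
NUMBER = re.compile(r"[+-]?[0-9]+(?:\.[0-9]+)?$")
NOTSPACE = re.compile(r"\S+$")

last_field = "-"  # what the log contains when no response code was recorded

print(bool(NUMBER.match(last_field)))    # False -> the whole grok match fails
print(bool(NOTSPACE.match(last_field)))  # True  -> NOTSPACE tolerates it
```

Because one failing sub-pattern fails the entire grok expression, a single - in that position was enough to tag every such event with _grokparsefailure.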

As an added bonus, I didn't care for the tags I was accumulating through that awkward chain of conditionals, so I consolidated the match patterns into a single grok, like so:

grok {
  match => {
    "message" => ["%{COMBINEDAPACHELOG}", "%{COMMONAPACHELOG}", ..... ]
  }
}

Looks cleaner and still works.
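For reference, a sketch of what that consolidated grok might look like in full, using the patterns from the original chain (with the %{NOTSPACE:response} fix). Grok's break_on_match option defaults to true, so patterns are tried in order and the first match wins:

```conf
grok {
  match => {
    "message" => [
      "%{COMBINEDAPACHELOG}",
      "%{COMMONAPACHELOG}",
      "%{HTTPD_ERRORLOG}",
      "\[%{HTTPDATE:timestamp}\] %{IPORHOST:clientip} %{NOTSPACE:tls_version} %{NOTSPACE:crypto_algorithm} \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NOTSPACE:response}"
    ]
  }
  tag_on_failure => ["no_apache_match"]
}
```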


(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.