NGINX access grok pattern

gotjoshua · October 23, 2017, 10:07am

Let's take this message as an example:

message:
sub.ourhost.org 10.10.10.10 - - [23/Oct/2017:09:30:23 +0000] 
"POST /api/v4/jobs/request HTTP/1.1" 204 0 "-" 
"gitlab-runner 10.0.1 (10-0-stable; go1.8.3; linux/amd64)"

If I use the pattern suggested in the 5.6 docs:

grok {
  match => { "message" => ["%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
}

The first match %{IPORHOST:[nginx][access][remote_ip]} gets me a remote_ip field, but it discards the host info.

How does IPORHOST work?

If it will sometimes return a hostname, does it make sense to put the results in a remote_ip field?

I decided that I want both, so I adapted the grok pattern:

grok {
  match => { "message" => ["%{DATA:[nginx][access][request_host]} %{DATA:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{DATA:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
}
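As a side note on the adapted pattern: grok's DATA is a lazy match-anything pattern (`.*?`), so the fields above are delimited only by the literal spaces and " - " around them and are not validated. The following is a minimal Python sketch of that behaviour, not Logstash itself; the regex is a hand-written stand-in for the first three fields of the pattern, and the second input line is made up for illustration.

```python
import re

# grok's DATA pattern is the lazy wildcard ".*?"
DATA = r".*?"

# Hand-written stand-in for the start of the adapted grok pattern
# (request_host, remote_ip, user_name), using DATA for every field.
adapted_prefix = re.compile(
    rf"(?P<request_host>{DATA}) (?P<remote_ip>{DATA}) - (?P<user_name>{DATA}) \["
)

# The real log line from the example (truncated after the timestamp).
good = adapted_prefix.search(
    "sub.ourhost.org 10.10.10.10 - - [23/Oct/2017:09:30:23 +0000]"
)

# A made-up line whose second field is clearly not an IP address.
odd = adapted_prefix.search(
    "??? not-an-ip - - [23/Oct/2017:09:30:23 +0000]"
)

print(good.groupdict())
print(odd.groupdict())  # still matches: DATA accepts any text
```

Both lines match, so a field named remote_ip can end up holding arbitrary text. Stricter patterns such as IPORHOST and HTTPDATE would reject the second line instead.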

Note that my nginx conf (from jwilder/nginx-proxy) logs both $host and $remote_addr:

github.com/jwilder/nginx-proxy/blob/master/nginx.tmpl#L55-57


{{ if (exists "/etc/nginx/dhparam/dhparam.pem") }}
ssl_dhparam /etc/nginx/dhparam/dhparam.pem;
{{ end }}


# Set appropriate X-Forwarded-Ssl header
map $scheme $proxy_x_forwarded_ssl {
default off;
https on;
}


gzip_types text/plain text/css application/javascript application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;


log_format vhost '$host $remote_addr - $remote_user [$time_local] '
               '"$request" $status $body_bytes_sent '
               '"$http_referer" "$http_user_agent"';


access_log off;


{{ if $.Env.RESOLVERS }}
resolver {{ $.Env.RESOLVERS }};
{{ end }}

The standard nginx log does not include $host.

magnusbaeck · October 23, 2017, 1:31pm

The first match %{IPORHOST:[nginx][access][remote_ip]} gets me a remote_ip field, but it discards the host info.

It doesn't discard anything. It matches an IP address or hostname. In your example that IP or hostname needs to be followed by " - ", so only 10.10.10.10 will match the IPORHOST pattern. If you want to get sub.ourhost.org into a field you need to say so:

^%{IPORHOST:[nginx][access][request_host]} %{IPORHOST:[nginx][access][remote_ip]} - ...
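To illustrate the point about matching: grok patterns are searched anywhere in the line unless anchored, so the docs pattern simply finds the first place where an IP or hostname is followed by " - ". Below is a rough Python sketch of that behaviour. The IP and HOSTNAME regexes are simplified approximations written for this example, not the real grok definitions, and only the first few fields of the pattern are reproduced.

```python
import re

# Simplified stand-ins for grok's IP and HOSTNAME patterns (assumptions for
# illustration; the real definitions are more thorough).
IP = r"\d{1,3}(?:\.\d{1,3}){3}"
HOSTNAME = r"[A-Za-z0-9][A-Za-z0-9-]*(?:\.[A-Za-z0-9][A-Za-z0-9-]*)*"
IPORHOST = rf"(?:{IP}|{HOSTNAME})"

LINE = (
    "sub.ourhost.org 10.10.10.10 - - [23/Oct/2017:09:30:23 +0000] "
    '"POST /api/v4/jobs/request HTTP/1.1" 204 0 "-" '
    '"gitlab-runner 10.0.1 (10-0-stable; go1.8.3; linux/amd64)"'
)

# Unanchored, like the pattern from the docs: one IPORHOST followed by " - ".
docs_style = re.compile(
    rf"(?P<remote_ip>{IPORHOST}) - (?P<user_name>.*?) \["
)

# Anchored with two IPORHOST fields, like the suggested fix.
two_fields = re.compile(
    rf"^(?P<request_host>{IPORHOST}) (?P<remote_ip>{IPORHOST}) - (?P<user_name>.*?) \["
)


def extract(pattern, line):
    """Return the named groups of the first match, or None if nothing matches."""
    m = pattern.search(line)
    return m.groupdict() if m else None


first = extract(docs_style, LINE)
second = extract(two_fields, LINE)
print(first)
print(second)
```

In the first case the search skips over sub.ourhost.org because it is not followed by " - ", and the match starts at 10.10.10.10. In the second case both values are captured because the pattern names both and is anchored to the start of the line.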

