Let's take this message as an example:
message:
sub.ourhost.org 10.10.10.10 - - [23/Oct/2017:09:30:23 +0000] "POST /api/v4/jobs/request HTTP/1.1" 204 0 "-" "gitlab-runner 10.0.1 (10-0-stable; go1.8.3; linux/amd64)"
If I use the pattern suggested in the 5.6 docs:
grok {
match => { "message" => ["%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
}
The first match, %{IPORHOST:[nginx][access][remote_ip]}, gets me a remote_ip field containing 10.10.10.10, but the leading hostname (sub.ourhost.org) is discarded.
How does IPORHOST work?
If it will sometimes return a hostname, does it make sense to put the results in a remote_ip field?
I decided that I want both, so I adapted the grok pattern:
grok {
match => { "message" => ["%{DATA:[nginx][access][request_host]} %{DATA:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{DATA:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
}
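A stricter variant I am considering, as a sketch only: it assumes every line starts with $host followed by $remote_addr, anchors the pattern with ^, and swaps the leading DATA captures for IPORHOST and IP and the time capture back to HTTPDATE. The field names are the same ones used above; I have only checked it against the sample message, not against my full logs.

```
grok {
  # Assumes the log format is "$host $remote_addr - $remote_user [$time_local] ..."
  # ^ anchors the match at the start of the line so the host cannot be skipped.
  # IPORHOST accepts either a hostname or an IP for the requested host;
  # IP restricts remote_ip to an actual IPv4/IPv6 address.
  match => { "message" => ["^%{IPORHOST:[nginx][access][request_host]} %{IP:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
}
```

With this version a line that lacks the leading host would fail to match and be tagged _grokparsefailure instead of silently losing a field, which may or may not be what you want.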
Note that my nginx conf (from jwilder/nginx-proxy) logs both $host and $remote_addr, whereas the standard nginx log format does not include $host.