I am trying to parse HTTP request headers with grok in Logstash. Here is a sample request:
GET /msdownload/update/v3/static/trustedr/en/authrootstl.cab?07f093d6b2ea8d03 HTTP/1.1
Connection: Keep-Alive
Accept: */*
User-Agent: Microsoft-CryptoAPI/10.0
Host: ctldl.windowsupdate.com
I have the following hacky way of doing it:
grok {
  match => {"client_payload" => "(?<request_method>HEAD|PUT|GET|POST|DELETE) %{DATA:request_path} HTTP/%{NUMBER:request_version}::(?<channel_rest>.)User-Agent: %{DATA:request_user-agent}::(?<channel_rest2>.)" }
}
This gives me a few fields, but the problem is that the order of the HTTP headers is not always the same: sometimes User-Agent: comes before Host:, and sometimes it is the other way around.
Basically I want to have the following values extracted:
HTTPMETHOD
URIPATH
HTTPVERSION
USER-AGENT
HOST
Is there a less hacky way of doing this?
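One order-independent approach would be to split the payload into lines, parse the request line on its own, and collect the headers into a hash keyed by lowercased name. Below is a rough sketch in plain Ruby, outside Logstash. The helper name `parse_http_request` is hypothetical, and the `::` line separator is an assumption taken from the grok pattern above (CR/LF is accepted too). Note that splitting on `::` would break a header value that itself contains `::`.

```ruby
# Hypothetical helper: parse a raw HTTP request into a flat hash.
# Assumes lines are separated by "::" (as in the grok pattern) or by CR/LF.
def parse_http_request(payload)
  lines = payload.split(/::|\r?\n/).map(&:strip).reject(&:empty?)
  result = {}

  # Request line: METHOD PATH HTTP/VERSION
  if (m = lines.first.to_s.match(%r{\A(HEAD|PUT|GET|POST|DELETE) (\S+) HTTP/(\d\.\d)\z}))
    result["request_method"]  = m[1]
    result["request_path"]    = m[2]
    result["request_version"] = m[3]
  end

  # Headers: "Name: value", collected in whatever order they arrive.
  headers = {}
  lines.drop(1).each do |line|
    name, value = line.split(/:\s*/, 2)
    headers[name.downcase] = value if value
  end

  result["request_user_agent"] = headers["user-agent"] if headers.key?("user-agent")
  result["request_host"]       = headers["host"]       if headers.key?("host")
  result
end
```

Because the headers are looked up by name after parsing, it does not matter whether Host: or User-Agent: arrives first.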
I am also trying a ruby filter:
ruby {
  code => '
    if event.get("client_payload")
      m = event.get("client_payload").match /(?<request_method>^(GET|POST|PUT|HEAD|DELETE)) (?<request_path>.*) HTTP\/(?<request_version>[0-9]{1}\.[0-9]{1})/
      if m
        event.set("request_path", m[:request_path])
        event.set("request_method", m[:request_method])
        event.set("request_version", m[:request_version])
      end
      useragent = event.get("client_payload").match /User-Agent: (?<user_agent>.*?)(::|$)/
      if useragent[:user_agent]
        event.set("request_user_agent", useragent[:user_agent])
      end
      host = event.get("client_payload").match /Host: (?<request_host>.*?)(::|$)/
      if host[:request_host]
        event.set("request_host", host[:request_host])
      end
    end
  '
}
However, I get a lot of errors in the Logstash logs, such as:
[2018-10-22T21:22:58,991][ERROR][logstash.filters.ruby ] Ruby exception occurred: undefined method `[]' for nil:NilClass
I am not sure where to go from here.
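The exception looks consistent with `String#match` returning `nil` when a payload has no matching header, so that `useragent[:user_agent]` or `host[:request_host]` ends up indexing `nil`. Here is a minimal sketch of a nil-safe lookup in plain Ruby, using `String#[]` with a regex and a group index, which returns `nil` instead of raising. The helper name `extract_header` is hypothetical, and the `::` separator is assumed from the patterns above.

```ruby
# Hypothetical helper: return the value of one header, or nil if it is absent.
# Assumes headers are separated by "::" or by a newline.
def extract_header(payload, name)
  # String#[] with a regex and a group index returns nil when nothing matches.
  payload[/(?:\A|::|\n)#{Regexp.escape(name)}: (.*?)\r?(?:::|\n|\z)/i, 1]
end

payload = "GET / HTTP/1.1::Host: ctldl.windowsupdate.com"
extract_header(payload, "Host")        # "ctldl.windowsupdate.com"
extract_header(payload, "User-Agent")  # nil, no exception
```

Inside the filter, the result could then be guarded before setting the field, for example `event.set("request_host", host) if host`.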