Can I get a specific part of the referrer in the nginx log when parsing it via Logstash?

My grok pattern is as below:

grok {
  match => {
    "message" => ["%{IPV4:IP_address} (?:-|(%{WORD}.%{WORD})) %{USER:ident} \[%{HTTPDATE:message_timestamp}\] \"(?:%{WORD:message_type} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QS:forwarder}"]
  }
}

Example log:

XXX.XXX.XXX.XX - - [17/Feb/2021:13:20:56 +0000] "GET /secure/useravatar?size=small&avatarId=10123 HTTP/1.1" 200 655 "https://jira.xx.io/browse/MINAUTO-100?focusedCommentId=10099&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel" "XX/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/XX.0.4240.XXX Safari/537.36" "-"

I need to extract "jira.xx.io" from the referrer, i.e. everything between "https://" and the first "/" that follows it.

You could try:

grok { match => { "referrer" => '"http[s]?://(?<someField>[^/]+/)' } }

This worked, but it returned jira.xx.io/ (with the trailing slash), so I used:

grok { match => { "referrer" => '"http[s]?://(?<someField>[^/]+)' } } 

and now I get jira.xx.io
Thanks for your help @Badger

Right, the / should have been outside the parentheses of the capture group: '"http[s]?://(?<someField>[^/]+)/'
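
For anyone landing here later, the second grok could sit in the filter section roughly like this. It is only a sketch and I have not run it as written: the field name referrer_host and the tag name are placeholders I made up (the thread used someField), and it assumes the grok from the question has already run so that [referrer] exists and is still wrapped in double quotes by %{QS}.

```
filter {
  # Assumes the "message" grok from the question runs before this one,
  # so [referrer] holds the quoted referrer string.
  grok {
    # https? matches both http:// and https://.
    # [^/]+ captures everything up to, but not including, the first "/".
    match => { "referrer" => '"https?://(?<referrer_host>[^/]+)' }
    # Hypothetical tag name. Lines with an empty referrer ("-") will not match;
    # without this option they would get the default _grokparsefailure tag.
    tag_on_failure => ["_referrer_host_missing"]
  }
}
```

With the example log line above, referrer_host should come out as jira.xx.io, the same value reported for someField earlier in the thread.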
