Extract subdomain from referrer in logstash


(Search) #1

I use Logstash to send Apache logs to Elasticsearch. My Logstash config file contains this filter:
grok {
  match => { "message" => "%{COMBINEDAPACHELOG}" }
}

Among the resulting fields is "referrer", from which I would like to extract the subdomain. For example, if
referrer = "http://mysubdomain.mydomain.com/controller/action"
then I need to extract the string "mysubdomain" and assign it to a new "subdomain" field.


#2

Use grok and match the referrer field against:

"https?://(?<subdomain>[^/]+)\.[^\./]+\.[^\./]+/"

Edit: updated so the capture group is named subdomain instead of domain.


(Search) #3

I'm new to Logstash. Could you tell me how to complete my second grok filter? It would be much appreciated.
grok {
  match => { "message" => "%{COMBINEDAPACHELOG}" }
}
grok {
  match => ???
}


#4
grok { match => [ "referrer", "https?://(?<subdomain>[^/]+)\.[^\./]+\.[^\./]+/" ] }

should do it.


(Search) #5

That doesn't work for me. The referrer varies:
referrer = "http://mysubdomain.mydomain.com/controller/action"
referrer = "http://mysubdomain.mydomain.com/controller/action/param..."
referrer = "http://mysubdomain.mydomain.com"
The scheme can also be either http or https.


#6

It already handles https. If the trailing / is optional, then use:

grok { match => [ "referrer", "https?://(?<subdomain>[^/]+)\.[^\./]+\.[^\./]+(/|$)" ] }
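
For reference, here is a rough sketch of how the two grok filters might sit together in one filter block. It assumes the legacy (non-ECS) COMBINEDAPACHELOG field name "referrer", and the tag name _subdomainparsefailure is made up for illustration; adjust both to your setup.

```
filter {
  # Parse the raw Apache access-log line into fields, including "referrer".
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Capture everything in the host before its last two dot-separated labels.
  grok {
    match => { "referrer" => "https?://(?<subdomain>[^/]+)\.[^\./]+\.[^\./]+(/|$)" }
    # Hypothetical tag name, so a failure here can be told apart from the first grok.
    tag_on_failure => ["_subdomainparsefailure"]
  }
}
```

Two things to keep in mind with this pattern: a referrer like http://a.b.mydomain.com gives subdomain "a.b", and a referrer without a subdomain (http://mydomain.com, or "-" when Apache logs no referrer) does not match, so the event only gets the failure tag.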

(Search) #7

It works. Thanks!

