Split grok pattern


#1

Hey,

I have only recently started using Logstash and am still getting my head around it.
Is it possible to run a grok filter that matches a pattern and then splits the match into two fields, or otherwise pulls other values out of it?

For example, I have been testing my patterns on https://grokdebug.herokuapp.com/
If I use the debugger to match, say, http://www.google.com against %{URI:test},
is it possible to also get the HOSTNAME out of URI?
URI > URIHOST > IPORHOST > HOSTNAME
URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{URIPATHPARAM})?

In the debugger I get this output:

{
  "test": [
    [
      "http://google.com"
    ]
  ],
  "URIPROTO": [
    [
      "http"
    ]
  ],
  "USER": [
    [
      null
    ]
  ],
  "USERNAME": [
    [
      null
    ]
  ],
  "URIHOST": [
    [
      "google.com"
    ]
  ],
  "IPORHOST": [
    [
      "google.com"
    ]
  ],
  "HOSTNAME": [
    [
      "google.com"
    ]
  ],
  "IP": [
    [
      null
    ]
  ],
  "IPV6": [
    [
      null
    ]
  ],
  "IPV4": [
    [
      null
    ]
  ],
  "port": [
    [
      null
    ]
  ],
  "URIPATHPARAM": [
    [
      null
    ]
  ],
  "URIPATH": [
    [
      null
    ]
  ],
  "URIPARAM": [
    [
      null
    ]
  ]
}

I'm interested in my field "test", but I would also like HOSTNAME as a field of its own. The value is obviously there in the debugger output, but I'm not sure how to get it into Elasticsearch as a field.


(Pemontto) #2

You can create your own pattern based on the existing URI pattern and give names to the sub-patterns you want captured, e.g.

The original pattern:

%{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{URIPATHPARAM})?

Customised pattern:

%{URIPROTO:uri_proto}://(?:%{USER:user}(?::[^@]*)?@)?(?:%{URIHOST:uri_host})?(?:%{URIPATHPARAM:uri_param})?

This will extract all the fields like so (the whole match is captured as "uri" here):

{
  "uri": [
    "http://www.google.com"
  ],
  "uri_proto": [
    "http"
  ],
  "user": [
    null
  ],
  "uri_host": [
    "www.google.com"
  ],
  "port": [
    null
  ],
  "uri_param": [
    null
  ]
}

You can then call this MY_URI or similar and add it to a custom patterns file, or just use the raw pattern in your config in place of %{URI}. You can also test custom patterns in the debugger by selecting the "Add custom patterns" checkbox.
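
As a rough sketch of the inline variant, the filter below defines the customised pattern with the grok filter's pattern_definitions option and matches it against the event. The pattern name MY_URI, the source field "message" and the target field "uri" are assumptions for illustration; adjust them to your own events.

```
filter {
  grok {
    # Define the customised pattern inline instead of in a patterns file.
    # MY_URI is a hypothetical name; any unused pattern name will do.
    pattern_definitions => {
      "MY_URI" => "%{URIPROTO:uri_proto}://(?:%{USER:user}(?::[^@]*)?@)?(?:%{URIHOST:uri_host})?(?:%{URIPATHPARAM:uri_param})?"
    }
    # Assumes the URL is the whole content of the "message" field.
    # The full match is stored in "uri"; the named sub-patterns become their own fields.
    match => { "message" => "%{MY_URI:uri}" }
  }
}
```

If you would rather keep it in a custom patterns file, put the same definition there as a single line (the name, a space, then the pattern) and point the filter's patterns_dir option at the directory containing that file. Note that URIHOST also covers an optional port, so uri_host would include it when one is present in the URL.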


#3

Thanks for the info.
After reading through that, it actually makes perfect sense to do it this way. I managed to make my own pattern for this and it's working well!

Cheers.

