Grok: capturing the same text more than once

Dear users,

I would like to build a grok filter to parse log files, but I am stuck on a problem that is probably easy, just not for me.

One of my log lines looks like this:

Conn=712914 - GET http://o3p.farm.mediaset.it/farmunica/2017/01/16655_159d549dc7789c/smooth/hd_pol.ism/QualityLevels(64000)/Fragments(audio_ita=44700160000) - 304 - 81.74.234.131:48033 - 0 etag="159d6b43378" - 7

I would like to capture some data from inside the URL (e.g. etag="%{ETAG:etag}"), but also the URL path as a whole: "farmunica/2017/01/16655_159d549dc7789c/smooth/hd_pol.ism/QualityLevels(64000)/Fragments(audio_ita=44700160000)"

Do I need to use two separate grok filters?
What is the most efficient way?

Regards,
Stefano

@Stefano_Bossi What device is this log coming from?

I'm not sure I understand correctly.
Do you want to extract the path part of the URL?
You could do it like this, e.g.:

[...] http://%{USER:domain}/%{NOTSPACE:path} [...]

I tried it with the Grok Debugger:

{
  "domain": [
    [
      "o3p.farm.mediaset.it"
    ]
  ],
  "USERNAME": [
    [
      "o3p.farm.mediaset.it"
    ]
  ],
  "path": [
    [
      "farmunica/2017/01/16655_159d549dc7789c/smooth/hd_pol.ism/QualityLevels(64000)/Fragments(audio_ita=44700160000)"
    ]
  ]
}

You won't actually get a USERNAME field in ES. It only shows up in the debugger because %{USER} is defined in terms of %{USERNAME}. You just get the domain and the path, so you can ignore it.
I used %{USER} because its pattern matches domain names reasonably well too.

Sorry, I phrased my question very badly.
Let me try again:

I have some logs coming from custom software, nothing commercial.
An example is this one:

[HttpServer] - Conn=86734174 - GET http://o1bp.farm.mediaset.it/farmunica/2018/03/169375_1621a76728b965/hlsnrcenc/l9/401.ts - 304 - 81.74.234.162:13784 - 0 etag="15acca2c9e8" - 8

I wrote a grok filter to extract all the information I need from the log: for example the etag, but also some information from the URL itself, such as the number 401, which I have named "chunkID".
So far so good, my grok filter works quite well and I have all the info I need.

As you can see, some of the info I have extracted comes from the URL "http://o1bp.farm.mediaset.it/farmunica/2018/03/169375_1621a76728b965/hlsnrcenc/l9/401.ts", and this is where the problem lies.
I need the info "inside" the URL in different fields, but also the URL itself in a separate field.

For example, my grok filter is able to extract these values:

"bytesSend": 534860,
"response": 200,
"syslogtag": "origin:",
"msg": " [HttpServer] - Conn=8610516 - GET http://o1bp.farm.mediaset.it/farmunica/2018/03/169375_1621a76728b965/hlsnrcenc/l9/401.ts - 200 - 81.74.228.136:46556 - 534860 etag=\"1621b45b540\" - 11",
"etag": "1621b45b540",
"@version": "1",
"sysloghost": "ms-origin03",
"appname": "origin",
"protocol": "http",
"tags": [
    "origin-geoip"
],
"path": "/tmp/logExample.log",
"clientPort": 46556,
"nameServer": "o1bp.farm.mediaset.it",
"severity": "INFO",
"hlsLevel": "l9",
"hlsVideoChunk": 401,
"@timestamp": "2018-04-06T09:59:58.144Z",
"connectionId": 8610516,
"geoip": {
    "country_code3": "IT",
    "country_name": "Italy",
    "continent_code": "EU",
    "location": {
        "lon": 12.1097,
        "lat": 43.1479
    },
    "timezone": "Europe/Rome",
    "ip": "81.74.228.136",
    "longitude": 12.1097,
    "country_code2": "IT",
    "latitude": 43.1479
},
"clientIp": "81.74.228.136",
"connDropped": "false",
"facility": "local2",
"uriType": "hls",
"responseTime": 11,
"method": "GET",
"cmsTag": "hlsnrcenc",
"host": "foxs-MacBook-Pro.local"

How could I also add the relative URL "/farmunica/2018/03/169375_1621a76728b965/hlsnrcenc/l9/401.ts" as a field of its own?
With a second grok filter?

I hope it is clearer now.

Anyway, thanks for your help.

Regards,
S.

Well, put the URL path into a field with the first grok, then use a second grok filter on that field to create the new fields. This way you retain the field with the URL path and get the new fields too. You are basically already doing the same thing with the message field.
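
A minimal sketch of that two-step approach, not a drop-in config. It assumes the raw line is in the msg field (as in the output above) and reuses the field names from that output. The first pattern is only a simplified stand-in for the existing filter, which is not shown in this thread, and uriPath is a hypothetical field name, chosen because path already holds the log file path in the output above.

```
filter {
  # First pass: capture everything after the host name as one field.
  # "uriPath" is a hypothetical name; "msg" is assumed to hold the raw line.
  grok {
    match => {
      "msg" => "%{WORD:method} %{WORD:protocol}://%{HOSTNAME:nameServer}%{NOTSPACE:uriPath} - %{INT:response:int}"
    }
  }

  # Second pass: parse the captured path into its parts.
  # uriPath itself is left untouched, so it stays available as a field.
  grok {
    match => {
      "uriPath" => "/(?<cmsTag>[^/]+)/(?<hlsLevel>[^/]+)/%{INT:hlsVideoChunk:int}\.ts$"
    }
  }
}
```

For the example line, the first grok should leave "/farmunica/2018/03/169375_1621a76728b965/hlsnrcenc/l9/401.ts" in uriPath, and the second should fill cmsTag, hlsLevel and hlsVideoChunk from its last three segments. Other URL shapes (such as the smooth-streaming one in the first post) would need their own patterns in the second grok.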

Thanks!

This was my idea too.

Regards,
S.
