I am parsing Microsoft IIS filter logs. Some entries contain IPv4 addresses and some contain IPv6 addresses. Every entry whose IPv6 address includes a scope ID is parsed into fields that are completely messed up. I have tested this in the Grok Debugger (inside Kibana) and on http://grokdebug.herokuapp.com/, with the same result.
Here is a sample log line:
2018-09-17 23:09:50 fe80::506d:e3e3:d813:5380%13 POST /mapi/emsmdb/ MailboxId=8df67065-38b2-4548-a250-cb9f8e04e25e@dadco.com 444 Anonymous fe80::506d:e3e3:d813:5380%13 Microsoft+Office/16.0+(Windows+NT+6.1;+Microsoft+Outlook+16.0.4639;+Pro) - 200 0 0 3
And here is my Grok Pattern:
^%{TIMESTAMP_ISO8601:timestamp} %{IP:destination_ip}%{SPACE}%{USERNAME:type} %{NOTSPACE:site}%{SPACE}MailboxId=%{NOTSPACE:request_id} %{NUMBER:port} %{NOTSPACE:host} %{IP:source_ip}%{SPACE}%{NOTSPACE:software} (-)? %{NUMBER:port2} %{NUMBER:num} %{NUMBER:num2} (%{NUMBER:num3})?
This only happens when the IPv6 address has a scope ID at the end, which is a '%' sign followed by a number. More details about scope IDs can be found here and here.
This is the output in the Grok Debugger (inside Kibana):
{
  "software": ")",
  "num": "0",
  "type": "T",
  "source_ip": "fe80::506d:e3e3:d813:5380%13 Microsoft+Office/16.0+(Windows+NT+6.1;+Microsoft+Outlook+16.0.4639;+Pro",
  "port2": "200",
  "site": "/mapi/emsmdb/",
  "destination_ip": "fe80::506d:e3e3:d813:5380%13 POS",
  "port": "444",
  "host": "Anonymous",
  "request_id": "8df67065-38b2-4548-a250-cb9f8e04e25e@dadco.com",
  "num3": "3",
  "timestamp": "2018-09-17 23:09:50",
  "num2": "0"
}
As you can see, both source_ip and destination_ip contain extra text that should not be there: destination_ip swallows "POS" (leaving only "T" for type), and source_ip swallows most of the user agent string. I cannot use the IPV6 pattern alone because some of the logs contain IPv4 addresses, so instead of writing two different grok patterns I am using IP to match both.
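To narrow it down, I tried to reproduce the behaviour outside of grok. The following is only a minimal Python sketch, not the real grok definitions: ADDR is a deliberately simplified stand-in for an IPv6 address, and GREEDY_IP rests on my assumption that the scope ID part of the IP pattern is matched with a greedy ".+" after the '%'. BOUNDED_IP is a hypothetical alternative that stops the scope ID at the first whitespace. The capture helper and the shortened LINE are names I made up for this test.

```python
import re

# Simplified stand-in for an IPv6 address; NOT the real grok IPV6 pattern.
ADDR = r"[0-9A-Fa-f:]+"

# Assumption: the scope ID is matched by a greedy "%.+" suffix.
GREEDY_IP = ADDR + r"(?:%.+)?"

# Hypothetical alternative: the scope ID stops at the first whitespace.
BOUNDED_IP = ADDR + r"(?:%\S+)?"

# Shortened version of the log line (timestamp and trailing fields removed).
LINE = "fe80::506d:e3e3:d813:5380%13 POST /mapi/emsmdb/"


def capture(ip_regex, line):
    """Mimic the start of my grok pattern: IP, SPACE, USERNAME, ' ', NOTSPACE."""
    pattern = re.compile(
        r"^(?P<ip>" + ip_regex + r")"   # %{IP:destination_ip}
        r"\s*"                          # %{SPACE}
        r"(?P<type>[A-Za-z0-9._-]+) "   # %{USERNAME:type} plus literal space
        r"(?P<site>\S+)"                # %{NOTSPACE:site}
    )
    match = pattern.match(line)
    if match is None:
        return None
    return match.group("ip"), match.group("type"), match.group("site")


print(capture(GREEDY_IP, LINE))
print(capture(BOUNDED_IP, LINE))
```

With the greedy variant, the ip group comes back as "fe80::506d:e3e3:d813:5380%13 POS" and type as "T", which is exactly what the Grok Debugger shows for destination_ip and type. With the bounded variant I get the address with "%13" and type "POST". So the symptom is at least consistent with a greedy scope ID match that backtracks only as far as it must, but I have not confirmed that this is how the bundled IP pattern is actually defined.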
Is this some kind of bug or am I doing something wrong?