Hi All,
Apologies if this has been asked before but I couldn't find a similar question in the forum.
I have the following log file line entry:
Tue Apr 05 16:31:39 2016 1 10.102.180.12 37088 d:\directory\Excel filename with a space.xsls a s i r Rest of the log line.
where the two fields immediately after the filename ("a s" in the example) can be any of the following combinations of transfer type/security:
a s
a n
b s
b n
Essentially, the first character represents the 'transfer type' (ASCII or binary) and the second the transfer security (secure or non-secure).
I've been trying to use grok to capture the filename/path that precedes any of the combinations above, and then continue parsing the rest of the line. Unfortunately the filename can include spaces, and potentially also the "a" character, as shown in the example above.
I'm struggling to format a regex to use the combinations of transfer type/security to allow me to find the end of the filename section in the log line and select it.
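For illustration, the idea of using the flag fields as the end-of-filename delimiter can be sketched in plain Python regex. This is only a sketch under assumptions: the group names mirror the field names used in this post, the timestamp and IP sub-patterns are simplified stand-ins rather than the stock grok definitions, and it assumes the status and access-mode letters always follow the mode and security letters as single space-separated characters. The key point is the lazy `.+?` for the path, which stops at the first place where the four single-letter fields can match.

```python
import re

# Sample line taken from the post.
LINE = (
    r"Tue Apr 05 16:31:39 2016 1 10.102.180.12 37088 "
    r"d:\directory\Excel filename with a space.xsls a s i r Rest of the log line."
)

# Simplified stand-ins for the timestamp/IP patterns (assumptions, not grok's own).
PATTERN = re.compile(
    r"(?P<event_time>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} \d{4}) "
    r"(?P<transfer_mins>\d+) "
    r"(?P<remote_host>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<transferred_bytes>\d+) "
    r"(?P<path>.+?) "                    # lazy: take as little as possible
    r"(?P<transfer_mode>[ab]) "          # ascii / binary
    r"(?P<transfer_security>[sn]) "      # secure / non-secure
    r"(?P<transfer_status>[ijkopq]) "
    r"(?P<access_mode>[ar]) "
    r"(?P<the_rest>.*)"
)

m = PATTERN.match(LINE)
fields = m.groupdict() if m else {}
for name, value in fields.items():
    print(f"{name}: {value}")
```

The " a s" inside "with a space" does not end the path early, because the pattern needs four single letters each followed by a space. Grok accepts inline named captures of the form (?<path>.+?), so the same lazy capture could probably be tried in place of %{PATH:path}, though that is untested here.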
Ideally I'd like something like this as the logstash expression:
%{TIMESTAMP:[@metadata][event_time]} %{NUMBER:transfer_mins} %{IP:remote_host} %{NUMBER:transferred_bytes} %{PATH:path} %{TRANSFER_MODE} %{TRANSFER_SECURITY} %{TRANSFER_STATUS} %{ACCESS_MODE} %{GREEDYDATA:the_rest}
Using the following custom patterns:
TIMESTAMP (%{DAY}[\s/-]%{MONTH}[\s/-]%{MONTHDAY}[\s]%{TIME}[\s]%{YEAR})
TRANSFER_MODE ([a|b])
TRANSFER_SECURITY ([s|n])
TRANSFER_STATUS ([i|j|k|o|p|q])
ACCESS_MODE ([a|r])
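One detail worth checking in the custom patterns above: inside square brackets the "|" is not an alternation operator but a literal character, so [a|b] accepts "a", "b" and "|". A small Python sketch of the difference (Python's regex engine is used here only as a stand-in; character classes behave the same way in the engine grok uses):

```python
import re

# Character class as written in the custom patterns: '|' is a literal member.
loose = re.compile(r"[a|b]")
# Character class without the pipe: only 'a' or 'b'.
strict = re.compile(r"[ab]")

loose_accepts_pipe = loose.fullmatch("|") is not None
strict_accepts_pipe = strict.fullmatch("|") is not None

print(loose_accepts_pipe)   # True
print(strict_accepts_pipe)  # False
```

Writing the classes as [ab], [sn], [ijkopq] and [ar] keeps them to the intended letters.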
But this gives me a grok error (using http://grokdebug.herokuapp.com/) because no match is found for %{TRANSFER_MODE} and others.
If I use the following pattern, it illustrates the problem with the %{PATH} capture:
%{TIMESTAMP:[@metadata][event_time]} %{NUMBER:transfer_mins} %{IP:remote_host} %{NUMBER:transferred_bytes} %{PATH:path} %{GREEDYDATA:the_rest}
This finds the following:
{
  "[": [
    [
      "Tue Apr 05 16:31:39 2016"
    ]
  ],
  "transfer_mins": [
    [
      "1"
    ]
  ],
  "remote_host": [
    [
      "10.102.180.12"
    ]
  ],
  "transferred_bytes": [
    [
      "37088"
    ]
  ],
  "path": [
    [
      "d:\directory\Excel filename with a space.xsls a s i r Rest of the"
    ]
  ],
  "the_rest": [
    [
      "log"
    ]
  ]
}
You can see that the path capture is too greedy: it swallows the transfer flags and most of the rest of the line.
If any masters-of-grok out there could advise, that would be much appreciated.
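The overshoot can be reproduced outside grok. The sketch below uses a hypothetical stand-in for a Windows path pattern whose character class admits spaces (it is not claimed to be the exact stock %{PATH} definition), followed by a space and a catch-all. The path first consumes everything up to the end of the input, then backtracks only as far as the last space, which is the minimum needed for the rest of the pattern to match.

```python
import re

# Tail of a sample line, ending at "log" (shortened for the demonstration).
tail = r"d:\directory\Excel filename with a space.xsls a s i r Rest of the log"

# Hypothetical path pattern: drive letter, then backslash-led segments whose
# characters may include spaces. Greedy, like the capture shown above.
greedy = re.compile(r"(?P<path>[A-Za-z]:(?:\\[^\\?*]*)+) (?P<the_rest>.*)")

g = greedy.match(tail)
greedy_path = g.group("path")
greedy_rest = g.group("the_rest")

print(greedy_path)
print(greedy_rest)
```

Because nothing after the path forces an earlier stop, the engine gives back as little as it can. Adding the single-letter flag fields after the path, or making the path capture lazy, gives it a reason to stop sooner.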
Cheers,
Steve