Logstash - parse to fields with grok

TL;DR at bottom

I have a custom log file that lists the commands users run (along with some other things), and a grok pattern that successfully parses each message into fields. I started getting parse failures when I was asked to add some really old servers that write a different timestamp format.

This is the log on the new servers:

[2020-07-21 12:59:31] SERVER-DB-230 john:USER=root PWD=/root PID=[22714] CMD="echo test9" Exit=[0] CONNECTION=
[2020-07-21 12:59:33] SERVER-DB-230 john:USER=root PWD=/root PID=[22714] CMD="echo test10" Exit=[0] CONNECTION=
[2020-07-21 12:59:35] SERVER-DB-230 john:USER=root PWD=/root PID=[22714] CMD="clear" Exit=[0] CONNECTION=

This is the log on the old server (different timestamps):

Jul 21 13:02:53 SERVER-DEV-NEW-167 root: USER=root PWD=/root PID=[10638] CMD="echo 2" Exit=[0] CONNECTION=1.2.3.4
Jul 21 13:02:54 SERVER-DEV-NEW-167 root: USER=root PWD=/root PID=[10638] CMD="echo 3" Exit=[0] CONNECTION=1.2.3.4
Jul 21 13:02:56 SERVER-DEV-NEW-167 root: USER=root PWD=/root PID=[10638] CMD="echo 4" Exit=[0] CONNECTION=1.2.3.4

Since these are the two log formats I have, I think it's best to have an 'if' statement that says: if grok failed to parse, try parsing with this other grok pattern. The thing is, even though the two formats are very similar, I wasn't able to make grok parse the old-server lines. I've been trying in the grok debugger but I just can't get it to work.

This is my current .conf in logstash: https://pastebin.com/QZv7zM1x

Does anyone know how to parse the second block of logs into fields? And how to make that pattern apply only if the first one failed? Thanks in advance!

TL;DR: I need help parsing the second block of logs, and having grok try that pattern only when the first one fails.

You can have grok match against an array of patterns. For example:

grok{
    match => {
        "message" => [
            "^\[(%{TIMESTAMP_ISO8601:sys_timestamp})\]\s(?<Hostname>[0-9a-zA-Z_-]+)\s(?<Logged as>[0-9a-zA-Z_-]+)\:USER=(?<User>[0-9a-zA-Z_-]+)\sPWD=(?<Directory>[0-9a-zA-Z_/-]+)\sPID=\[(?<PID>[0-9]+)\]\sCMD=\"(?<Command>.*)\"\sExit=\[(?<Exit>[0-9]+)\]\sCONNECTION=(?<Connetion>.*)",
            "^%{SYSLOG_TIMESTAMP:sys_timestamp}\s(?<Hostname>[0-9a-zA-Z_-]+)\s(?<Logged as>[0-9a-zA-Z_-]+)\:USER=(?<User>[0-9a-zA-Z_-]+)\sPWD=(?<Directory>[0-9a-zA-Z_/-]+)\sPID=\[(?<PID>[0-9]+)\]\sCMD=\"(?<Command>.*)\"\sExit=\[(?<Exit>[0-9]+)\]\sCONNECTION=(?<Connetion>.*)"
        ]
    }
}

Note that I anchored the patterns with ^ to make them fail more quickly.
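With the array form, grok tries the patterns in order and, with the default break_on_match setting, stops at the first one that matches, so the second pattern is only attempted when the first fails. If an explicit 'if' is still preferred, a minimal sketch could look like the following. The tag name _new_format_failed and the field name rest are made up for illustration, and the patterns are shortened to the timestamp prefix to keep the example readable.

```
filter {
  # First attempt: new-server format, e.g. "[2020-07-21 12:59:31] ..."
  grok {
    match => { "message" => "^\[%{TIMESTAMP_ISO8601:sys_timestamp}\]\s%{GREEDYDATA:rest}" }
    # Hypothetical tag name, used instead of the default _grokparsefailure
    tag_on_failure => ["_new_format_failed"]
  }

  # Second attempt runs only when the first grok tagged the event as failed
  if "_new_format_failed" in [tags] {
    grok {
      # Old-server format, e.g. "Jul 21 13:02:53 ..."
      match => { "message" => "^%{SYSLOGTIMESTAMP:sys_timestamp}\s%{GREEDYDATA:rest}" }
      # remove_tag is applied only when this grok succeeds
      remove_tag => ["_new_format_failed"]
    }
  }
}
```

Events that match neither format keep the first tag and also get the default _grokparsefailure tag from the second grok, which makes them easy to find later.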

You might want to try parsing the timestamp, hostname and user using grok then parse the rest of the line with a kv filter.
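A rough sketch of that grok-plus-kv approach, assuming the key=value part of the line is captured into a temporary field (kv_pairs is a made-up name) and that the field name Logged_as is acceptable in place of the original one:

```
filter {
  grok {
    match => {
      "message" => [
        # New servers: "[2020-07-21 12:59:31] HOST john:USER=root ..."
        "^\[%{TIMESTAMP_ISO8601:sys_timestamp}\]\s%{NOTSPACE:Hostname}\s(?<Logged_as>[^:\s]+):\s?%{GREEDYDATA:kv_pairs}",
        # Old servers: "Jul 21 13:02:53 HOST root: USER=root ..."
        "^%{SYSLOGTIMESTAMP:sys_timestamp}\s%{NOTSPACE:Hostname}\s(?<Logged_as>[^:\s]+):\s?%{GREEDYDATA:kv_pairs}"
      ]
    }
  }

  # Split "USER=root PWD=/root PID=[10638] CMD=\"echo 2\" ..." into fields
  kv {
    source => "kv_pairs"
    field_split => " "
    value_split => "="
    # Drop the temporary field once the pairs have been extracted
    remove_field => ["kv_pairs"]
  }
}
```

The resulting fields take their names from the keys in the log (USER, PWD, PID, CMD, Exit, CONNECTION) rather than the names used in the grok-only version. By default kv treats double quotes and square brackets as value wrappers, so CMD and PID should come out without them, but it is worth checking on real data. How the empty CONNECTION= on the new servers is handled depends on the kv plugin version and its settings.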

Thank you so much for the response.

I'm getting an error in the grok debugger. Also, may I ask about the '^'?

Sorry, my mistake: there is no underscore in SYSLOGTIMESTAMP, so the second pattern should start with %{SYSLOGTIMESTAMP:sys_timestamp}.

The reason to anchor the patterns is discussed in this blog post.

My bad, my first attempt was actually without the underscore, and I just got "no match". It does return results if I keep only the first field. Maybe the custom fields don't play well with the predefined patterns?

Also, thanks for the anchor patterns article!

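I think you need a \s before USER= in the second pattern, since the old-server lines have a space after the colon ("root: USER=root").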
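Putting the two corrections from this thread together (SYSLOGTIMESTAMP without the underscore, and whitespace before USER=), the grok block might look like the sketch below. Two things here are assumptions rather than something confirmed in the thread: the capture names Logged_as and Connection replace "Logged as" and "Connetion", and \s? is used in both patterns so that either format tolerates an optional space after the colon.

```
grok {
    match => {
        "message" => [
            # New servers: "[2020-07-21 12:59:31] SERVER-DB-230 john:USER=root ..."
            "^\[%{TIMESTAMP_ISO8601:sys_timestamp}\]\s(?<Hostname>[0-9a-zA-Z_-]+)\s(?<Logged_as>[0-9a-zA-Z_-]+):\s?USER=(?<User>[0-9a-zA-Z_-]+)\sPWD=(?<Directory>[0-9a-zA-Z_/-]+)\sPID=\[(?<PID>[0-9]+)\]\sCMD=\"(?<Command>.*)\"\sExit=\[(?<Exit>[0-9]+)\]\sCONNECTION=(?<Connection>.*)",
            # Old servers: "Jul 21 13:02:53 SERVER-DEV-NEW-167 root: USER=root ..."
            "^%{SYSLOGTIMESTAMP:sys_timestamp}\s(?<Hostname>[0-9a-zA-Z_-]+)\s(?<Logged_as>[0-9a-zA-Z_-]+):\s?USER=(?<User>[0-9a-zA-Z_-]+)\sPWD=(?<Directory>[0-9a-zA-Z_/-]+)\sPID=\[(?<PID>[0-9]+)\]\sCMD=\"(?<Command>.*)\"\sExit=\[(?<Exit>[0-9]+)\]\sCONNECTION=(?<Connection>.*)"
        ]
    }
}
```

If existing dashboards or queries already rely on the original field names, keep those names and apply only the two pattern fixes.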
