Logstash Grok Parsing Based on SYSLOGPROG Condition

Hi All,

I am trying to integrate the output of a centralized Rsyslog server with Logstash. The Rsyslog output contains Apache access logs, /var/log/messages, the secure log, and so on, plus logs from systemd log sources such as local3 to local7, from a number of hosts.

I have created some grok patterns for Apache access logs, audit logs, and user activity logs (a custom log). All of these are shipped from client machines via rsyslog to one central rsyslog server, which in turn outputs them to Logstash.

Now I am trying to get the mixed log stream parsed by Logstash according to the %{SYSLOGPROG} value. My grok patterns and samples of the expected rsyslog output are below:


APACHE_ACCESS %{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:logsource} %{SYSLOGPROG}: %{IPORHOST:clientip} (?:-|%{USER:ident}) (?:-|%{USER:auth}) \[%{HTTPDATE:access_timestamp}\] "(?:%{WORD:request_type} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|-)" %{NUMBER:response} (?:-|%{NUMBER:bytes}) "%{NOTSPACE:request_uri}" "%{GREEDYDATA:User_agent}"

AUDIT %{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:logsource} %{SYSLOGPROG}: type=%{WORD:audit_type} msg=audit\(%{NUMBER:audit_epoch}:%{NUMBER:audit_counter}\): user pid=%{NUMBER:audit_pid} uid=%{NUMBER:audit_uid} auid=%{NUMBER:audit_audid} ses=%{NUMBER:audit_ses} msg=%{GREEDYDATA:audit_message}

ACTIVITY %{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:logsource} %{USER:ssh_user}: %{USER:escalation} %{SYSLOGPROG} %{IPORHOST:clientip} %{GREEDYDATA:activity_message}

The logs I expect from the rsyslog forwarder server look like this:

================================
Aug 30 18:33:04 syslogclient01 root: root User-Activity 192.168.1.104 [59879]: touch test [0]

Aug 30 18:33:44 syslogclient01 tag_audit_log: type=CRYPTO_KEY_USER msg=audit(1472562224.404:56190): user pid=60001 uid=0 auid=4294967295 ses=4294967295 msg='op=destroy kind=session fp=? direction=both spid=60002 suid=74 rport=7021 laddr=172.20.20.151 lport=22 exe="/usr/sbin/sshd" hostname=? addr=172.20.20.152 terminal=? res=success'

Aug 30 18:33:12 syslogclient01 root: root User-Activity 192.168.1.104 [59974]: less /var/log/cron [0]

Aug 30 15:08:40 syslogclient01 apache-access: 192.168.1.104 - - [30/Aug/2016:15:08:34 +0530] "GET /_static/classic.css HTTP/1.1" 304 - "http://rsyslogdoc.com/" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36 OPR/38.0.2220.41"

Aug 30 15:08:40 syslogclient01 apache-access: 192.168.1.104 - - [30/Aug/2016:15:08:34 +0530] "GET /_static/pygments.css HTTP/1.1" 304 - "http://rsyslogdoc.com/" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36 OPR/38.0.2220.41"

Aug 30 18:33:43 syslogclient01 root: root User-Activity 192.168.1.104 [59974]: ps ax | grep tail [0]

Aug 30 18:33:44 syslogclient01 tag_audit_log: type=CRYPTO_KEY_USER msg=audit(1472562224.405:56192): user pid=60001 uid=0 auid=4294967295 ses=4294967295 msg='op=destroy kind=server fp=9d:ca:03:95:28:8e:a2:e3:f0:e8:70:fc:4e:b9:11:01 direction=? spid=60001 suid=0 exe="/usr/sbin/sshd" hostname=? addr=172.20.20.152 terminal=? res=success'

Aug 30 18:33:44 syslogclient01 tag_audit_log: type=USER_LOGIN msg=audit(1472562224.405:56193): user pid=60001 uid=0 auid=4294967295 ses=4294967295 msg='op=login acct="root" exe="/usr/sbin/sshd" hostname=? addr=172.20.20.152 terminal=ssh res=failed'

I would like to have the above grok patterns applied to incoming logs based on %{SYSLOGPROG}: if %{SYSLOGPROG} is apache-access, apply the APACHE_ACCESS pattern; if it is User-Activity, apply ACTIVITY; and so on.

Is this feasible? I searched on Google, but none of the examples I found worked for me.

The grok filter supports matching against multiple expressions, so you could simply list all expressions in the same grok filter and it'll try them one by one until there's a match.
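As a rough, untested sketch of this first option: assuming the three custom patterns from the question are saved in a patterns file under a directory such as /etc/logstash/patterns (a hypothetical path, not something from the original post), a single filter could list them all:

```
filter {
  grok {
    # Hypothetical directory holding the APACHE_ACCESS, AUDIT and ACTIVITY definitions.
    patterns_dir => ["/etc/logstash/patterns"]
    # Expressions are tried in the listed order; with the default
    # break_on_match => true, grok stops at the first one that matches.
    match => {
      "message" => [
        "%{APACHE_ACCESS}",
        "%{AUDIT}",
        "%{ACTIVITY}"
      ]
    }
  }
}
```

Listing the most specific expressions first is probably the safer ordering, since a loose pattern placed early could match lines that were meant for a later one.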

Another option is to use a conditional to select which grok filter to use:

if [message] =~ /^[A-Za-z]+ +\d+ \d\d:\d\d:\d\d \S+ apache-access: / {
  # grok filter for apache access
} else if ... {
  ...
}

A third option is to use one grok filter to extract the first token after the hostname, then reference that field in subsequent conditionals:

grok {
  match => {
    "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG:syslogprog}: "
  }
}
if [syslogprog] == "apache-access" {
  # grok filter for apache access
} else if ... {
  ...
}
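Filled in for the three log types in this thread, the third option might look roughly like the untested sketch below. The patterns_dir path is a hypothetical location for the custom pattern definitions. Note that in the sample User-Activity lines the token after the hostname is the user name (root), so this sketch assumes a regular-expression test on the message for that branch instead of a comparison on [syslogprog]:

```
filter {
  # First pass: pull out only the token after the hostname.
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG:syslogprog}: "
    }
  }

  # Second pass: pick the detailed pattern based on that token.
  if [syslogprog] == "apache-access" {
    grok {
      patterns_dir => ["/etc/logstash/patterns"]   # hypothetical path
      match => { "message" => "%{APACHE_ACCESS}" }
    }
  } else if [syslogprog] == "tag_audit_log" {
    grok {
      patterns_dir => ["/etc/logstash/patterns"]   # hypothetical path
      match => { "message" => "%{AUDIT}" }
    }
  } else if [message] =~ / User-Activity / {
    # Assumption: activity lines are recognised by the literal marker in the message.
    grok {
      patterns_dir => ["/etc/logstash/patterns"]   # hypothetical path
      match => { "message" => "%{ACTIVITY}" }
    }
  }
}
```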

Hey Magnus,

Thanks for the advice. I need a bit more help.

I have created a sample logstash.conf:


input { stdin { } }

filter {
grok {
match => {
"message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG:syslogprog}: "
}

if [syslogprog] == "apache-access" {

grok {
match => { "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG:syslogprog}: %{COMBINEDAPACHELOG}" }
}

}
}
}

output {
stdout { codec => rubydebug }
}

This grok pattern works well with Apache logs in the grok debugger.

But when running this I get:

Error: Expected one of #, => at line 9, column 4 (byte 138) after filter {
grok {
match => {
"message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG:syslogprog}: "
}

if {:level=>:error}

I am confused about what is wrong here. Please advise.

Thanks,
Bhuvanesh

If you indent your configuration it'll be easy to see that you're not closing the first grok filter before the if [syslogprog] == "apache-access" conditional.
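For illustration, here is the same sample configuration indented, with the first grok block closed before the conditional. Apart from whitespace and comments, the only change is moving one closing brace up from the end of the filter section:

```
input { stdin { } }

filter {
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG:syslogprog}: "
    }
  }   # this closing brace was missing in the original

  if [syslogprog] == "apache-access" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG:syslogprog}: %{COMBINEDAPACHELOG}" }
    }
  }
}

output {
  stdout { codec => rubydebug }
}
```

Conditionals are only valid at the top level of a filter section (or inside another conditional), not inside a plugin block, which is why the parser complained when it reached the if.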

Hey Magnus,

Thank you very much!

That was my mistake. Now I have a working filter!! :)

Best Regards,
Bhuvanesh