Trivial syslog processing not working

I'm sorry to post here if the answer turns out to be trivial after all, but I've now spent countless hours reading everywhere trying to solve this.

I'm basically sending syslog messages to Logstash:

input {
        tcp {
                host => "127.0.0.1"
                port => 10514
                codec => "json"
                type => "rsyslog"
        }
}

I'm also copying the logs to flat files for debugging/history:

May 7 14:50:08 core postfix/smtpd[4180]: disconnect from unknown[149.56.0.30]

Logstash is configured to process syslog as described in the official configuration examples:

if [type] == "rsyslog" {
                grok {
                        match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
                        add_field => [ "received_at", "%{@timestamp}" ]
                        add_field => [ "received_from", "%{host}" ]
                }
                date {
                        match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
                }
        }

However, when inspecting Kibana, the syslog message processing seems to fail:

May 7th 2017, 14:50:08.896

host:        core
severity:        info
@timestamp:        May 7th 2017, 14:50:08.896
port:        34,668
@version:        1
message:        disconnect from unknown[149.56.0.30]
type:        rsyslog
facility:        mail
syslog-tag:        postfix/smtpd[4180]:
timestamp:        May 7th 2017, 14:50:08.000
tags:        _grokparsefailure
_id:        AVvi9cOM40sPYnZsVv-C
_type:        rsyslog
_index:        logstash-2017.05.07
_score:        - 

The _grokparsefailure tag obviously says I'm wrong somewhere...
The logs don't say much either.
Could it be a default configuration I don't know about, or a specific thing I missed?
I'm running Logstash 5.4.0, Elasticsearch 5.4.0, Kibana 5.4.0, on Ubuntu 16.04 x86-64.

What does a raw event look like? What does the result of your config look like if you send it to a stdout output with a rubydebug codec?
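For example, a minimal debugging output section could look like this (a sketch to drop into your config temporarily):

output {
        stdout { codec => rubydebug }
}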

Is your input format really JSON?

Do you have any extra files in /etc/logstash/conf.d? Logstash reads all files there.

The syslog line comes directly from my central rsyslog server file output.
Here's my rsyslog config: /etc/rsyslog.d/60-output.conf

*.*                     @@127.0.0.1:10514;json-template
*.*                     -/var/log/siem_debug.log

I don't recall where the json-template comes from, but it seems at least part of each syslog message is sent as JSON; otherwise Logstash wouldn't understand it at all.
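Judging by the fields that later show up in Logstash (severity, facility, syslog-tag, timestamp, message), it's probably a template along these lines. This is only a sketch based on common rsyslog-to-Logstash examples, not my actual file:

template(name="json-template" type="list") {
        constant(value="{")
        constant(value="\"timestamp\":\"")     property(name="timereported" dateFormat="rfc3339")
        constant(value="\",\"host\":\"")       property(name="hostname")
        constant(value="\",\"severity\":\"")   property(name="syslogseverity-text")
        constant(value="\",\"facility\":\"")   property(name="syslogfacility-text")
        constant(value="\",\"syslog-tag\":\"") property(name="syslogtag" format="json")
        constant(value="\",\"message\":\"")    property(name="msg" format="json")
        constant(value="\"}\n")
}

Note that rsyslog's msg property is only the free-text part after the syslog header (the leading space visible in the message field below is typical of msg). In the forwarding rule, @@ means TCP, a single @ would be UDP, and the leading - on the file output disables syncing after each write.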

Nope, no other file.

Can you post the whole config file?

I've enabled the rubydebug codec, but I'm not sure exactly how it works.
Here's one of the JSON events I got:

{
      "severity" => "info",
    "@timestamp" => 2017-05-07T19:20:01.560Z,
          "port" => 36736,
          "host" => "core",
      "@version" => "1",
       "message" => " pam_unix(cron:session): session closed for user root",
          "type" => "rsyslog",
      "facility" => "authpriv",
    "syslog-tag" => "CRON[11011]:",
     "timestamp" => "2017-05-07T21:20:01+02:00",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}

Here is the whole logstash.conf file, without comments:

input {
        tcp {
                host => "127.0.0.1"
                port => 10514
                codec => "json"
                type => "rsyslog"
        }
}
filter { 
        if [type] == "rsyslog" {
                # create "syslog_service" and syslog_process_id fields out of syslog-tag content
                grok {
                        #match => { "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG} %{GREEDYDATA:syslog_message}" }
                        match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
                        add_field => [ "received_at", "%{@timestamp}" ]
                        add_field => [ "received_from", "%{host}" ]
                }
                date {
                        match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
                }
        }
}

output {
        if [type] == "rsyslog" {
                elasticsearch { hosts => [ "127.0.0.1:9200" ] }
                stdout { codec => rubydebug }
        }
}

I replaced the tcp input with stdin... I sent the following line and got no grokparsefailure:

May 7 14:50:08 core postfix/smtpd[4180]: disconnect from unknown[149.56.0.30]

{
       "received_from" => "jupp-laptop-e530",
          "@timestamp" => 2017-05-07T12:50:08.000Z,
          "syslog_pid" => "4180",
     "syslog_hostname" => "core",
    "syslog_timestamp" => "May 7 14:50:08",
         "received_at" => "2017-05-07T19:30:43.171Z",
            "@version" => "1",
                "host" => "jupp-laptop-e530",
      "syslog_program" => "postfix/smtpd",
             "message" => "May 7 14:50:08 core postfix/smtpd[4180]: disconnect from unknown[149.56.0.30]",
                "type" => "rsyslog",
      "syslog_message" => "disconnect from unknown[149.56.0.30]"
}

Disable your grok and print the incoming message directly to stdout, so you can see what's flying in. I think the problem lies there.

input {
        #stdin {
        #    type => "rsyslog"
        #}
        tcp {
                host => "127.0.0.1"
                port => 10514
                codec => "json"
                type => "rsyslog"
        }
}
filter { 
        if [type] == "rsyslog" {
                # create "syslog_service" and syslog_process_id fields out of syslog-tag content
               # grok {
                #        #match => { "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG} %{GREEDYDATA:syslog_message}" }
                #        match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
                #        add_field => [ "received_at", "%{@timestamp}" ]
                #        add_field => [ "received_from", "%{host}" ]
                #}
                #date {
                #        match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
                #}
        }
}

output {
        if [type] == "rsyslog" {
                #elasticsearch { hosts => [ "127.0.0.1:9200" ] }
                stdout { codec => rubydebug }
        }
}

Seems like this really is JSON coming in (output from rubydebug on the console):

{
      "severity" => "info",
    "@timestamp" => 2017-05-07T19:55:01.332Z,
          "port" => 37000,
          "host" => "core",
      "@version" => "1",
       "message" => " pam_unix(cron:session): session closed for user root",
          "type" => "rsyslog",
      "facility" => "authpriv",
    "syslog-tag" => "CRON[12073]:",
     "timestamp" => "2017-05-07T21:55:01.330590+02:00"
}

But I'm a bit puzzled by the syslog-tag field.

OK, so you receive JSON directly, but the message field is not what your grok expects: here, message is already just the syslog_message part. That is why you get the grokparsefailure.

The sender built this JSON, so that is also where the "syslog-tag" field comes from.

I found this link... maybe it can help you with configuring rsyslog:

*.* @@127.0.0.1:10514;json-template

I'm making an assumption here that json-template does what you say: that it sends JSON instead of RFC3164 or RFC5424 format. This is OK to do.

My guess right now is that your grok pattern is correctly indicating a failure to parse because your pattern is expecting RFC3164 format for the message field.

To support my hypothesis, the sample JSON from rsyslog to Logstash that you posted is this:

      "severity" => "info",
    "@timestamp" => 2017-05-07T19:20:01.560Z,
          "port" => 36736,
          "host" => "core",
      "@version" => "1",
       "message" => " pam_unix(cron:session): session closed for user root",
          "type" => "rsyslog",
      "facility" => "authpriv",
    "syslog-tag" => "CRON[11011]:",
     "timestamp" => "2017-05-07T21:20:01+02:00",
          "tags" => [
        [0] "_grokparsefailure"
    ]

Your grok is this:

                        match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }

This will never produce the following fields that exist in your event:

  • severity
  • facility
  • syslog-tag
  • timestamp

The message field, additionally, looks like this in your tcp-input event:

 pam_unix(cron:session): session closed for user root

Notice the lack of a timestamp, hostname, and syslog program in this message.

I recommend using netcat or similar to watch the data sent by rsyslog to confirm exactly what form is being sent.
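For example, with the OpenBSD netcat that ships with Ubuntu, you could stop Logstash (to free the port) and run:

nc -l 127.0.0.1 10514

Anything rsyslog forwards to that socket will then be printed verbatim to your terminal.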

Best I can tell, grok is correctly indicating a parse failure because rsyslog is sending JSON with the message field as shown above.


Thanks, this is actually where I got my config :wink:

You nailed it. I was looking for patterns that were not in the message.
Root cause analysis: confusion between message, msg, and what I understood as being the message.
Since my initial need was to split the program and the pid, I just had to split syslog-tag in two, and it worked great.
Thanks to all of you for supporting me!
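For reference, a minimal grok that splits syslog-tag that way could look like this (a sketch; the field names are my choice, and my actual filter may differ slightly):

grok {
        match => { "syslog-tag" => "%{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?:" }
}

Against a tag like postfix/smtpd[4180]: this yields syslog_program = postfix/smtpd and syslog_pid = 4180.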

@dcn I am happy to hear that you got things sorted out :slight_smile:
