I really need help with standard rsyslog parsing

Hi,

env: rsyslog (clients) => syslog-ng (server) => logstash 1.5 => elasticsearch => kibana

I'm trying to parse (match) the standard Linux rsyslog message format, but I always get _grokparsefailure and am unable to capture the sending server's hostname; the host field only ever shows the syslog-ng server itself.

A standard rsyslog message as seen in Kibana:

@timestamp	November 3rd 2015, 11:04:02.312
@version	1
_id	  	AVDMza3ZCWCwaLoTIQfe
_index	  	logstash-2015.11.03
_type	  	linux-syslog
host	  	ourCentralSyslog-ngServer
message	  	2015-11-03T11:04:01+01:00 the.ip.address.of.the.sending.server the-sending-server-hostname sshd[10003]: Connection closed by 10.100.8.44 [preauth]
path	  	/path/to/central/syslog.log
tags	  	our-syslog, _grokparsefailure
type	  	linux-syslog

Filters tried:

match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{GREEDYDATA:syslog_message}" }

match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{IPORHOST:ip_address} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }

match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{IPORHOST:ip_address} %{SYSLOGHOST:syslog_hostname} %{GREEDYDATA:syslog_message}" }

I have also tried the example here:
https://www.elastic.co/guide/en/logstash/current/config-examples.html#_processing_syslog_messages

I'd be very grateful if some kind soul could give me advice on how to do the matching.

Many thanks in advance
Tomas

The timestamp in your message doesn't match SYSLOGTIMESTAMP. Try TIMESTAMP_ISO8601 instead.
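
Against your sample line, something like this might do it (an untested sketch; the field names are just suggestions):

match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp} %{IPORHOST:ip_address} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }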

Thank you very much Magnus. I tested the examples below against the standard message, but still no go:

match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp} %{IPORHOST:ip_address} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{GREEDYDATA:syslog_message}" }
match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp} %{IPORHOST:ip_address} %{SYSLOGHOST:syslog_hostname} %{GREEDYDATA:syslog_message}" }

Do you have any other ideas?

I'm sorry, but I'm very new to this, both to ELK itself and to syslog formats.
Thanks
Tomas

Be systematic. Start with %{TIMESTAMP_ISO8601:syslog_timestamp}.* to see if at least that matches. If so, add the next part of the expression. And another. When things stop working you've found which part of the expression is problematic.
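
For your sample line the progression could look something like this (a sketch; test one pattern at a time):

match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp}.*" }
match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp} %{IPORHOST:ip_address}.*" }
match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp} %{IPORHOST:ip_address} %{SYSLOGHOST:syslog_hostname}.*" }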

Thanks again, but I must have something really wrong somewhere. It does not even match with:

match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp}.*" }

This is the date/time format we have in the logs from both linux servers and devices:
2015-11-03T13:28:52+01:00

I can't reproduce:

$ cat test.config 
input { stdin { } }
output { stdout { codec => rubydebug } }
filter {
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp}.*"
    }
  }
}
$ echo '2015-11-03T13:28:52+01:00' | /opt/logstash/bin/logstash -f test.config
Logstash startup completed
{
             "message" => "2015-11-03T13:28:52+01:00",
            "@version" => "1",
          "@timestamp" => "2015-11-03T12:33:58.809Z",
                "host" => "lnxolofon",
    "syslog_timestamp" => "2015-11-03T13:28:52+01:00"
}
Logstash shutdown completed
$ /opt/logstash/bin/logstash --version
logstash 1.5.3

I'm very sorry, I had a little typo. Now it matches, so I get the fields "received_at" and "received_from", but I still get the tag:

_grokparsefailure

Would that be normal anyway, if I don't do a complete match?

This is the input I have right now:

input {
  file {
    path => ["/path/to/central/syslog.log"]
    type => "linux-syslog"
    tags => [ "our-syslog" ]
  }
}

filter {
  if [type] == "linux-syslog" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp}.*" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

output {
  elasticsearch { host => localhost }
}

No, that's not expected. Grok only adds _grokparsefailure when none of the patterns in match can be applied, and a partial expression with a trailing .* that matches still counts as a match. I don't see why you should get that tag in this case; your configuration should be written so that you don't get it under normal circumstances.

Are you sure that's all the configuration you have? How are you invoking Logstash? Any extra files in /etc/logstash/conf.d?
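
One more thing I noticed: the patterns in your date filter (MMM  d HH:mm:ss and MMM dd HH:mm:ss) won't match your ISO 8601 timestamps. The date filter has a built-in keyword for that, so something like this should work with the syslog_timestamp field your grok extracts:

date {
  match => [ "syslog_timestamp", "ISO8601" ]
}

(A failing date filter adds _dateparsefailure rather than _grokparsefailure, though, so that's a separate issue.)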

Yes, I have one more configuration file in conf.d, for the logstash-forwarder:

input {
  lumberjack {
    port => 5000
    type => "logs"
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}

That's it?!

Well, my minimal example did not result in _grokparsefailure (for you either, right?). Start from there and add additional features, one by one, until you find what's causing this.
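
One thing to keep in mind while you do that: Logstash reads and concatenates every file in /etc/logstash/conf.d, so filters in one file also run against events coming from inputs defined in other files. Guarding the grok with a conditional on the event type, like you already do, keeps it scoped:

filter {
  if [type] == "linux-syslog" {
    grok {
      # only events from the file input get this far
    }
  }
}

(A sketch; your lumberjack events have type "logs", so they should already be skipping this block.)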

Thanks, will do. You have been very helpful :)

Yes, got it now. This is what I ended up with:

filter {
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:origin_timestamp} %{IPORHOST:ip_address} %{SYSLOGHOST:origin_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}"
    }
    add_field => [ "received_at", "%{@timestamp}" ]
    add_field => [ "received_from", "%{host}" ]
  }

  syslog_pri { }

  date {
    match => [ "origin_timestamp", "ISO8601" ]
  }

  mutate {
    replace => [ "host", "%{origin_hostname}" ]
    replace => [ "message", "%{syslog_message}" ]
  }
}
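
The two mutate replaces at the end are what finally give me the sending server in Kibana: host is overwritten with the hostname grok pulled out of the message (origin_hostname) instead of the syslog-ng server, and message is trimmed down to just the actual payload.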

Thank you