and now I'm collecting the information with the right severity/facility.
The approach is really flexible, but as discussed in the original post, this configuration has some limits and can create heavy performance problems.
I'm sure an easier solution must exist, but so far I haven't found it.
The biggest performance problems in grok patterns occur with unanchored patterns, because when the parser encounters a mismatch, it simply skips a character and tries all over again. This is especially wasteful if the beginning of the pattern is likely to match arbitrary bits in the middle of malformed inputs.
By anchoring the pattern to the beginning of the string, we can keep partial-matches from wasting resources.
# parse both RFC 3164 and 5424
grok {
  patterns_dir => "/etc/logstash/pattern.d"
  match => [ "message", "^%{SYSLOG}" ]
  tag_on_failure => [ "_grokparsefailure_syslog" ]
}
I'm a little surprised that the pipeline does so much work to pull out the priority at the beginning before the grok pattern is able to pull it out. Is there a reason for this? From what I can tell in the grok patterns, it should be matching the same bits from either RFC's compatible messages.
.....
Of course, 'syslog' is a very muddy term. By default, this input only
supports RFC3164 syslog with some small modifications.
However, some non-standard syslog formats can be read and parsed if
a functional grok_pattern is provided. The date format is still only
allowed to be RFC3164 style or ISO8601.
.....
Working with rsyslog templates, it is possible to send out logs in RFC 3164 format.
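As a sketch, a legacy-style forwarding rule can bind one of the built-in templates to the output (the host and port below are placeholders, not from my setup):

```
# Forward everything in traditional RFC 3164 format
# (10.0.0.1:514 is a placeholder for the Logstash host)
*.* @@10.0.0.1:514;RSYSLOG_TraditionalForwardFormat
```

The template name after the semicolon selects which of the standard formats rsyslog uses when forwarding.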
From what I can test (at the moment the ELK cluster is still not in production), "...anchoring the pattern to the beginning of the string..." works well and improves performance.
Regarding the pipeline, I'm still testing different rsyslog formats, but using the basic approach (sample configuration) I'm still receiving messages with the wrong severity/facility.
I've tested all the standard rsyslog templates in the rsyslog forwarding rule, only to "win" a "_grokparsefailure" whenever I use a different log format:
# RSYSLOG_TraditionalFileFormat - the "old style" default log file format with low-precision timestamps
# RSYSLOG_FileFormat - a modern-style logfile format similar to TraditionalFileFormat,
# but with high-precision timestamps and timezone information
# RSYSLOG_TraditionalForwardFormat - the traditional forwarding format with low-precision timestamps.
# Most useful if you send messages to other syslogd's or rsyslogd below version 3.12.5.
# RSYSLOG_ForwardFormat - a new high-precision forwarding format very similar to the traditional one, but with
# high-precision timestamps and timezone information. Recommended to be used when sending
# messages to rsyslog 3.12.5 or above.
# RSYSLOG_SyslogProtocol23Format - the format specified in IETF's internet-draft ietf-syslog-protocol-23, which is assumed
# to become the new syslog standard RFC. This format includes several improvements.
# The rsyslog message parser understands this format, so you can use it together with all
# relatively recent versions of rsyslog. Other syslogd's may get hopelessly confused if
# receiving that format, so check before you use it. Note that the format is unlikely to
# change when the final RFC comes out, but this may happen.
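For completeness, the same template selection in modern RainerScript syntax might look like this (the target host and port are hypothetical placeholders):

```
# Hypothetical forwarding rule: send all messages over TCP
# using the SyslogProtocol23Format (RFC 5424-style) template
*.* action(type="omfwd"
           target="logstash.example.com" port="5514" protocol="tcp"
           template="RSYSLOG_SyslogProtocol23Format")
```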
Until now, no. My approach was based on rsyslog configuration because rsyslog v8 (Fedora/RedHat/CentOS, Debian/Ubuntu) is one of the most common Linux logging daemons, and I'm a little surprised that the Logstash syslog plugin has problems decoding its output without a long pipeline and a complex pattern.
The solution should be easy, but until now I haven't found it.
Anyway... a cup of coffee, VMs powered on, and I'm going to test the grok constructor.
Once again, thanks for your support @yaauie, and have a nice day.
After many tests I started thinking that maybe the CentOS 7.5 logs have a special format, so we created two new servers: one Debian 9.4 x64 and one Ubuntu 16.04 LTS x64.
Well, after the installation and rsyslog configuration, nothing changed. With the chain suggested on the web we always got "severity code 5 and facility code 1".
Using the big pattern and the long processing chain we got the correct severity and facility. It looks strange that three of the most used Linux distros would all have the same problem.
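For what it's worth, the syslog PRI value encodes facility × 8 + severity, so "facility 1, severity 5" is exactly what you get from decoding PRI 13 — which, if I read the docs correctly, is the default the syslog_pri filter falls back to when no priority can be extracted from the message. A quick Python sketch of the arithmetic:

```python
def decode_pri(pri):
    """Split a syslog PRI value into (facility, severity), per RFC 3164."""
    return pri // 8, pri % 8

# PRI 13 -> facility 1 ("user"), severity 5 ("notice"):
# the same pair we kept seeing on every distro.
print(decode_pri(13))  # (1, 5)

# For comparison, a kernel warning (facility 0, severity 4) is PRI 4:
print(decode_pri(4))   # (0, 4)
```

So seeing that pair everywhere suggests the priority is never being parsed at all, rather than each distro emitting a strange format.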
Maybe I still have to understand something, but after many hours of testing we suppose there is a problem with the syslog input plugin, or maybe there is a mistake in the documentation's Configuration Examples.
p.s.
Maybe it is a coincidence, but in the output example I also found "severity code 5 and facility code 1"...