I am using the syslog input plugin to receive logs from remote servers. When I parse these logs with the 'stdin' or 'file' input, everything works fine. But when the same logs reach Logstash through the 'syslog' input, the grok filter sometimes fails on some of them, seemingly at random (the event gets a _grokparsefailure tag). If I feed the same failed event through the stdin input again, it is parsed properly. I do not understand why it fails randomly. Is it caused by the syslog input or by the multiline filter? I am running my configuration with a single worker thread (-w 1). Can someone please help?
My Logstash configuration file is below, followed by a sample log line and my pattern file.
Thanks in advance.
input {
  syslog {
    port => 514
    type => syslog
  }
  stdin { }
}
filter {
  multiline {
    pattern => "^.\[_.*,\[ACTIVE\]"
    negate => true
    what => "previous"
  }
  grok {
    match => [ "message", "%{TY_LOG_MESSAGE}" ]
    keep_empty_captures => false
    named_captures_only => true
    patterns_dir => "/app01/ELK/ty_sys_patterns"
    add_tag => [ "parsed_header" ]
    tag_on_failure => [ "parse_header_failure" ]
  }
}
output {
  elasticsearch {
    embedded => false
    protocol => http
    host => "localhost"
  }
  stdout {
    codec => rubydebug
  }
}
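For reference, with negate => true and what => "previous", the multiline filter appends every line that does not match the pattern to the previous event. A minimal Python sketch of that grouping rule follows. It is a simulation, not Logstash code: the function name `merge_multiline`, the sample lines, and the simplified stand-in pattern `\[ACTIVE\]` are all assumptions made for illustration and are not the pattern from the config above.

```python
import re


def merge_multiline(lines, pattern):
    """Simulate multiline { negate => true, what => "previous" }.

    A line that matches `pattern` starts a new event; any other line is
    appended to the previous event, separated by a newline.
    """
    start_of_event = re.compile(pattern)
    events = []
    for line in lines:
        if start_of_event.search(line) or not events:
            # Matching line (or the very first line): begin a new event.
            events.append(line)
        else:
            # Non-matching line: fold it into the previous event.
            events[-1] += "\n" + line
    return events


# Hypothetical input lines, loosely shaped like the WebLogic header.
lines = [
    "[________,[ACTIVE] ExecuteThread: '23']:D: first message",
    "    continuation of first message",
    "[________,[ACTIVE] ExecuteThread: '24']:E: second message",
]

# Stand-in pattern (assumption): any line containing a literal "[ACTIVE]".
events = merge_multiline(lines, r"\[ACTIVE\]")
for event in events:
    print(repr(event))
```

Because the rule depends on the order in which lines arrive, a sketch like this can be used to check whether a given sequence of lines would be grouped as expected before grok sees it.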
Sample Log:
<183>Aug 14 09:12:05 uc1ucbtyweb02 weblogicTY [________,[ACTIVE] ExecuteThread: '23' for queue: 'weblogic.kernel.Default (self-tuning)']:D: 14 09:12:04.825: BaseAction.execute: INSIDE BASEACTION X-Forwarded-For ADDRESS null
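The leading `<183>` in the sample is the syslog priority value, which encodes facility * 8 + severity. A small Python sketch that splits it back into its two parts is shown below; the helper name `decode_priority` is hypothetical and only the header portion of the sample line is used.

```python
import re


def decode_priority(line):
    """Return (facility, severity) from a line starting with <PRI>."""
    match = re.match(r"<(\d{1,3})>", line)
    if match is None:
        raise ValueError("no syslog priority prefix")
    priority = int(match.group(1))
    # Priority = facility * 8 + severity, so integer division and
    # remainder recover the two components.
    return priority // 8, priority % 8


facility, severity = decode_priority(
    "<183>Aug 14 09:12:05 uc1ucbtyweb02 weblogicTY"
)
print(facility, severity)  # facility 22 is local6, severity 7 is debug
```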
Pattern file:
###################################
# Generic patterns, commonly used
###################################
SPACER [\r\n\s]*
# Matches as many characters as it can, including line endings. This is needed for grabbing a multiline message
MULTILINE_GREEDY %{GREEDYDATA}(?:%{SPACER}%{GREEDYDATA})*
MEMBER_ID_WORD \b[M|m]ember[\s-_]*[I|i]d\b
CLIENT_IP X-Forwarded-For = %{IP:clientip}
HOST_SITE Host = (?<host_site>.*com)
REQUEST_PAGE - - %{URIPATHPARAM:requestpage}
###########################################################
# Header message patterns (Weblogic, Log4J, Quartz, etc.)
###########################################################
SYSLOG_TIME_STAMP_EX %{MONTH:slMonth} %{MONTHDAY:slDay} %{HOUR:slHour}:%{MINUTE:slMinute}:%{SECOND:slSecond}
# Matches the SYSLOG message <183>0 2015-05-08T11:37:47.944168-05:00 uc1ucbtyweb04 weblogicTY - -
SYSLOG_MESSAGE ^<%{NUMBER:syslognum}>0 %{TIMESTAMP_ISO8601:syslogtime} %{SYSLOGHOST:servername} %{WORD:application} - -
SYSLOG_MESSAGE ^<%{NUMBER:syslognum}>%{SYSLOG_TIME_STAMP:syslogtime} %{SYSLOGHOST:servername} %{WORD:application}
SYSLOG_MESSAGE ^<%{NUMBER:syslognum}>%{MONTH:slMonth} %{MONTHDAY:slDay} %{HOUR:slHour}:%{MINUTE:slMinute}:%{SECOND:slSecond} %{SYSLOGHOST:servername} %{WORD:application}
SYSLOG_MESSAGE ^<%{NUMBER:syslognum}>%{SYSLOG_TIME_STAMP_EX:sysLogTS} %{SYSLOGHOST:servername} %{WORD:application}
# Matches "[________," and "[_user123,"
USERNAME_HEADER ^\[_*%{WORD:username},
USERNAME_HEADER \[_*%{WORD:username},
# Matches the WL_WORKER or the Quartz worker
WORKER_INFO (?.*?)]
# Matches "[ACTIVE] ExecuteThread: '20' for queue: 'weblogic.kernel.Default (self-tuning)'" or similar
WL_WORKER \[ACTIVE\] ExecuteThread: '%{INT:threadNum}' for queue: 'weblogic\.kernel\.Default \(self-tuning\)'
# Matches a single-character logging level designation (ERROR 'E', WARN 'W', INFO 'I', DEBUG 'D', or TRACE 'T')
LOG_LEVEL :(?[EWIDT]):
# The date format output on every log message, probably defined in the Log4J configuration
TY_DATE %{MONTHDAY:wlDay} %{HOUR:wlHour}:%{MINUTE:wlMinute}:%{SECOND:wlSecond}.%{INT:wlMillisecond}
CLIENTIP X-Forwarded-For = %{WORD:clientip}
# Put these all together to parse all the header information. The rest of the message is put into its own field
TY_LOG_MESSAGE %{SYSLOG_MESSAGE}%{USERNAME_HEADER}%{WORKER_INFO}%{LOG_LEVEL} %{TY_DATE:tyDate}: %{JAVACLASS:logSource}: %{MULTILINE_GREEDY:tyLogMessage}