Hi Logstashers,
I'm seeing some odd behaviour with Logstash when reading files with the multiline filter enabled. Occasionally a multiline log entry fails to be detected/tagged as multiline, which results in a _grokparsefailure for the 'second' part of the entry (the first part gets treated as a single-line entry, and the second part doesn't match the subsequent grok pattern, hence the failure).
The odd thing is that this only happens when I'm reading a test file of several hundred log lines. When I chop the file down to just the offending entry plus a few others (let's say five log entries in all), the multiline entry is processed without any problems at all...
This implies some sort of threading/race issue, but I'm afraid I'm pretty stumped at the moment as to what it could be.
Below is the relevant part of my config, which includes the multiline statement plus the subsequent grok (I've left out the part that sets the job_id field and the job_id_found/job_id_not_found tags you'll see in the output below):
filter {
  multiline {
    patterns_dir => "C:\Applications\Elastic\BWLogReader_Test\logstash_patterns"
    pattern => "^%{TIBCO_DATESTAMP} "
    negate => true
    what => "previous"
  }
  if [type] == "tibco_bw" {
    # Extract main fields
    grok {
      patterns_dir => "C:\Applications\Elastic\BWLogReader_Test\logstash_patterns"
      match => [ "message",
        "%{TIBCO_DATESTAMP:log_time}(\sGMT\s%{INT})?\s%{BW_APP_NAME:application}-%{WORD:engine}\s%{WORD:log_level}\s\[%{NOTSPACE:log_type}\]\s%{GREEDYDATA:log_msg}" ]
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
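For completeness, TIBCO_DATESTAMP and BW_APP_NAME are custom patterns defined in the patterns_dir above. They're roughly along the following lines (paraphrased to match the log_time and application values in the output below, so treat the exact regexes as illustrative rather than a verbatim copy of my pattern file):
# Matches e.g. "2015 Aug 12 09:37:47:378"
TIBCO_DATESTAMP %{YEAR} %{MONTH} %{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}:%{INT}
# Matches e.g. "BW.EAI_BW_PeopleSoftAppServices_3"
BW_APP_NAME BW\.[A-Za-z0-9_]+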
Below is a sample of the data I see on stdout when the multiline detection fails (as you can see, Logstash has treated the entry as two separate events):
{
"@version" => "1",
"@timestamp" => "2015-08-12T08:37:47.378Z",
"host" => "L210845",
"path" => "C:/Applications/Elastic/BWLogReader_Test/test_logs/test_bw_multiline.log",
"type" => "tibco_bw",
"log_time" => "2015 Aug 12 09:37:47:378",
"application" => "BW.EAI_BW_PeopleSoftAppServices_3",
"engine" => "PeopleSoftAppServices_3_055",
"log_level" => "Debug",
"log_type" => "BW-User",
"log_msg" => "ResponseMessage Job-50528969 [BusinessServices/Peoplesoft/RenderTSO/Interface/SourceRequest.process/GA_GP_RenderTSOGroup/LogExit]: <root><ns0:ResponseMessage xmlns:ns0=\"http://w
ww.tibco.com/xmlns/ae2xsd/2002/05/ae/Common\"><MESSAGE><?xml version=\"1.0\" encoding=\"UTF-8\"?>",
"job_id" => "50528969",
"tags" => [
[0] "job_id_found"
]
}
{
"message" => "<ResultStatus>\n <STATUS_CODE>0</STATUS_CODE>\n</ResultStatus></MESSAGE></ns0:ResponseMessage><ReplySubject>CRMDomain.STEVE_2</ReplySubject></root> ",
"@version" => "1",
"@timestamp" => "2015-08-14T13:13:02.619Z",
"host" => "L210845",
"path" => "C:/Applications/Elastic/BWLogReader_Test/test_logs/test_bw_multiline.log",
"type" => "tibco_bw",
"tags" => [
[0] "multiline",
[1] "_grokparsefailure",
[2] "job_id_not_found"
],
"job_id" => "None"
}
...and here is the same record as processed by Logstash when I've made the input file much smaller (as you can see, it's now correctly recognised as multiline and combined into a single Logstash event):
{
"@version" => "1",
"@timestamp" => "2015-08-12T08:37:47.378Z",
"host" => "L210845",
"path" => "C:/Applications/Elastic/BWLogReader_Test/test_logs/test_multiline_1.log",
"type" => "tibco_bw",
"tags" => [
[0] "multiline",
[1] "job_id_found"
],
"log_time" => "2015 Aug 12 09:37:47:378",
"application" => "BW.EAI_BW_PeopleSoftAppServices_3",
"engine" => "PeopleSoftAppServices_3_055",
"log_level" => "Debug",
"log_type" => "BW-User",
"log_msg" => "ResponseMessage Job-50528969 [BusinessServices/Peoplesoft/RenderTSO/Interface/SourceRequest.process/GA_GP_RenderTSOGroup/LogExit]: <root><ns0:ResponseMessage xmlns:ns0=\"http://w
ww.tibco.com/xmlns/ae2xsd/2002/05/ae/Common\"><MESSAGE><?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<ResultStatus>\n <STATUS_CODE>0</STATUS_CODE>\n</ResultStatus></MESS
AGE></ns0:ResponseMessage><ReplySubject>CRMDomain.STEVE_2</ReplySubject></root> ",
"job_id" => "50528969"
}
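In case anyone wants to reproduce this, the raw entry in the test file looks like the following (reconstructed from the fields above, so minor details such as the optional "GMT <offset>" after the timestamp may differ; the important part is that it spans four physical lines and only the first starts with a timestamp):
2015 Aug 12 09:37:47:378 BW.EAI_BW_PeopleSoftAppServices_3-PeopleSoftAppServices_3_055 Debug [BW-User] ResponseMessage Job-50528969 [BusinessServices/Peoplesoft/RenderTSO/Interface/SourceRequest.process/GA_GP_RenderTSOGroup/LogExit]: <root><ns0:ResponseMessage xmlns:ns0="http://www.tibco.com/xmlns/ae2xsd/2002/05/ae/Common"><MESSAGE><?xml version="1.0" encoding="UTF-8"?>
<ResultStatus>
 <STATUS_CODE>0</STATUS_CODE>
</ResultStatus></MESSAGE></ns0:ResponseMessage><ReplySubject>CRMDomain.STEVE_2</ReplySubject></root>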
If anyone can shed any light on this, that would be really helpful. At the moment it's driving me crazy...
Cheers,
Steve