Logstash: Log records fail to load in large log file but work when run separately


#1

Filebeat is feeding data into Logstash, which passes it on to Elasticsearch. I'm monitoring a load of around 5 million records and notice that I'm getting a lot of _grokparsefailure tags.

When I drill into the Logstash output I can see the following:

15:11:09.999 [[main]>worker1] WARN logstash.filters.date - Failed parsing date from field {:field=>"time", :value=>"2016-10-02_02:59:40.997", :exception=>"Invalid format: "2016-10-02_02:59:40.997" is malformed at "_02:59:40.997"", :config_parsers=>"yyyy-MM-dd_HH:mm:ss.SSS,yyyy-MM-dd HH:mm:ss.SSS", :config_locale=>"default=en_AU"}

The record that caused this to fail is:

2016-10-02_02:59:40.997 [transaction-1] INFO companyA.engine.TransactionHandler - Transaction Completed : SOCKETID=62437,TXNREFERENCE=,CLIENTID=10000031,RESPONSECODE=-1,RESPONSETEXT=TRANSACTION NOT FOUND,DURATION=0,TRANSACTIONTYPE=STATUS[STATUS],INTERFACE=CREDITCARD

I tested this in the Grok Debugger and it worked fine.
I loaded it separately in its own log file and it loaded fine.

The Logstash pipeline conf file is:

input {
  stdin {
  }

  beats {
    port => "5043"
  }
}

filter {
  grok {
    patterns_dir => ["./patterns"]
    match => { "message" => "%{TRANTRACK_DATE:time} %{NOTSPACE} %{NOTSPACE} %{NOTSPACE} - %{NOTSPACE} Completed :\s\w*=(%{WORD:socketval})?,\w*=(%{WORD:txnref})?,\w*=(%{WORD:clientid})?,\w*=(%{NUMBER:respcode})?,\w*=(%{MULTI_WORD:resptext})?,\w*=(%{WORD:duration})?,\w*=(%{TRAN_WITH_SUBTYPE:txntype})?,\w*=(%{WORD:interface})?" }
    overwrite => ["message"]
  }

  if [message] =~ "Transaction Start" {
    drop {}
  }

  date {
    match => ["time", "yyyy-MM-dd_HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSS"]
    #timezone => "Australia/Sydney"
  }
}

output {
  #stdout { codec => json }
  elasticsearch {
    hosts => [ "x.x.x.x:9200" ]
    index => "companyAtrantrack-companyB-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

Every record that failed as part of the main load parses fine in a smaller file. I'm confused.


(Jakob Reiter) #2

The pattern seems OK, and it works for me with just 2016-10-02_02:59:40.997.
It might be worth testing this again with the original file; maybe there are (invisible) control characters in there or something.
The issue seems to be with this part: "02:59:40.997".
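
If you want to rule out hidden characters, here's a minimal sketch in Python that flags any lines containing control characters other than tab, LF and CR (the file name is just a placeholder for your actual log):

import re

# Match ASCII control characters other than tab (\x09), LF (\x0a) and CR (\x0d).
CONTROL_CHARS = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")

# "transactions.log" is a placeholder; point this at the original log file.
with open("transactions.log", "rb") as f:
    for lineno, line in enumerate(f, start=1):
        if CONTROL_CHARS.search(line):
            print(f"line {lineno}: {line!r}")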


(Jakob Reiter) #3

OK, just saw your other post and I think you already answered your question there.

After uncommenting timezone => "Australia/Sydney" (I had used UTC+1 before) I'm also getting this error, and it's because there is no 02:00-03:00 on 2016-10-02 in that timezone (daylight saving switch-over).
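
The gap is easy to demonstrate outside Logstash. A minimal sketch using Python's pytz (purely illustrative, not part of the pipeline above):

from datetime import datetime
import pytz

sydney = pytz.timezone("Australia/Sydney")
# 2016-10-02 02:59:40.997 falls inside the DST switch-over gap:
# clocks in Sydney jumped from 02:00 straight to 03:00 that morning.
ts = datetime(2016, 10, 2, 2, 59, 40, 997000)
sydney.localize(ts, is_dst=None)  # raises pytz.exceptions.NonExistentTimeError

Logstash's date filter relies on Joda-Time, which likewise rejects this non-existent local time once the Australia/Sydney zone is applied.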

