Filebeat is feeding data into Logstash, which passes it on to Elasticsearch. I'm monitoring a load of around 5 million records and notice that I'm getting a lot of _grokparsefailure tags.
When I drill into the Logstash output I can see the following:
15:11:09.999 [[main]>worker1] WARN logstash.filters.date - Failed parsing date from field {:field=>"time", :value=>"2016-10-02_02:59:40.997", :exception=>"Invalid format: "2016-10-02_02:59:40.997" is malformed at "_02:59:40.997"", :config_parsers=>"yyyy-MM-dd_HH:mm:ss.SSS,yyyy-MM-dd HH:mm:ss.SSS", :config_locale=>"default=en_AU"}
The record that caused this to fail is:
2016-10-02_02:59:40.997 [transaction-1] INFO companyA.engine.TransactionHandler - Transaction Completed : SOCKETID=62437,TXNREFERENCE=,CLIENTID=10000031,RESPONSECODE=-1,RESPONSETEXT=TRANSACTION NOT FOUND,DURATION=0,TRANSACTIONTYPE=STATUS[STATUS],INTERFACE=CREDITCARD
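For anyone wanting to reproduce, a minimal pipeline along these lines exercises just the timestamp parsing in isolation (the simple grok here is a stand-in that peels off the leading timestamp, not my real pattern):

input { stdin {} }
filter {
  # Stand-in pattern: capture everything up to the first space as "time".
  grok {
    match => { "message" => "%{NOTSPACE:time} %{GREEDYDATA:rest}" }
  }
  date {
    # Same parsers as in the main pipeline below.
    match => ["time", "yyyy-MM-dd_HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSS"]
  }
}
output { stdout { codec => rubydebug } }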
I tested this in the Grok Debugger and it worked fine.
I also loaded the record separately in its own log file and it parsed fine.
The logstash pipeline conf file is:
input {
  stdin {
  }
  beats {
    port => "5043"
  }
}
filter {
  grok {
    patterns_dir => ["./patterns"]
    match => { "message" => "%{TRANTRACK_DATE:time} %{NOTSPACE} %{NOTSPACE} %{NOTSPACE} - %{NOTSPACE} Completed :\s\w*=(%{WORD:socketval})?,\w*=(%{WORD:txnref})?,\w*=(%{WORD:clientid})?,\w*=(%{NUMBER:respcode})?,\w*=(%{MULTI_WORD:resptext})?,\w*=(%{WORD:duration})?,\w*=(%{TRAN_WITH_SUBTYPE:txntype})?,\w*=(%{WORD:interface})?" }
    overwrite => ["message"]
  }
  if ([message] =~ "Transaction Start") {
    drop {}
  }
  date {
    match => ["time", "yyyy-MM-dd_HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSS"]
    #timezone => "Australia/Sydney"
  }
}
output {
  #stdout { codec => json }
  elasticsearch {
    hosts => [ "x.x.x.x:9200" ]
    index => "companyAtrantrack-companyB-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
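One thing I notice rereading the config: "Transaction Start" lines hit the grok filter before the conditional drops them, so grok tags them with _grokparsefailure first. I'm not sure whether that affects the counts I'm seeing, but a reordered sketch of the filter block (same grok and date settings, with the drop moved in front) would keep those lines out of grok entirely:

filter {
  # Drop "Transaction Start" lines before grok sees them, so they are
  # never tagged with _grokparsefailure.
  if [message] =~ "Transaction Start" {
    drop {}
  }
  grok {
    patterns_dir => ["./patterns"]
    match => { "message" => "%{TRANTRACK_DATE:time} %{NOTSPACE} %{NOTSPACE} %{NOTSPACE} - %{NOTSPACE} Completed :\s\w*=(%{WORD:socketval})?,\w*=(%{WORD:txnref})?,\w*=(%{WORD:clientid})?,\w*=(%{NUMBER:respcode})?,\w*=(%{MULTI_WORD:resptext})?,\w*=(%{WORD:duration})?,\w*=(%{TRAN_WITH_SUBTYPE:txntype})?,\w*=(%{WORD:interface})?" }
    overwrite => ["message"]
  }
  date {
    match => ["time", "yyyy-MM-dd_HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSS"]
  }
}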
Each record that failed as part of the main load has parsed fine when fed in from a smaller file on its own. I'm confused about what's different between the two runs.
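To get a better look at what's failing, a sketch of an output section that diverts failing events to a file for inspection (the file path is just an example):

output {
  if "_grokparsefailure" in [tags] or "_dateparsefailure" in [tags] {
    # Write failed events out raw so the exact bytes can be inspected.
    file {
      path => "/tmp/parse_failures.log"   # example path
      codec => line { format => "%{message}" }
    }
  }
  elasticsearch {
    hosts => [ "x.x.x.x:9200" ]
    index => "companyAtrantrack-companyB-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}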