Logstash fails to load Indian log records due to daylight saving change on host machine


#1

I'm not sure if there's a way around this one.

I have a concatenated year-long log file provided by an Indian client who does not yet use Logstash. I have set up a test box in Sydney, Australia to parse their logs and analyse their transaction volume.

It loads mostly successfully, except that every log entry between 02:00 and 03:00 on 2/10/2016 fails. I have identified that this is almost certainly due to the NSW (Australia) daylight saving change at that time, when the clocks skip forward an hour. The logs were generated on a server in India.

I've tried setting the timezone as follows:

    date {
        match => ["time", "yyyy-MM-dd_HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSS"]
        timezone => "Asia/Kolkata"
    }

but I'm still seeing the issue. I'm missing 1,450 records; not earth-shattering, but annoying.
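To make the DST gap concrete, here is a minimal Java sketch (java.time is used purely for illustration and is an assumption on my part; as far as I know the Logstash 5.x date filter actually uses Joda-Time internally). It shows that the failing wall-clock hour simply does not exist in Australia/Sydney, while it is perfectly valid in Asia/Kolkata:

import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Illustration only: a wall-clock time from the failing 02:00-03:00 window
// on 2016-10-02, resolved against the two zones involved.
public class DstGapCheck {
    public static void main(String[] args) {
        LocalDateTime local = LocalDateTime.parse("2016-10-02T02:30:00");

        // Australia/Sydney jumps from 02:00 straight to 03:00 on this date,
        // so java.time shifts the non-existent time forward by the gap length.
        ZonedDateTime sydney = local.atZone(ZoneId.of("Australia/Sydney"));
        System.out.println(sydney); // 2016-10-02T03:30+11:00[Australia/Sydney]

        // Asia/Kolkata has no DST, so the same wall-clock time is valid as-is.
        ZonedDateTime kolkata = local.atZone(ZoneId.of("Asia/Kolkata"));
        System.out.println(kolkata); // 2016-10-02T02:30+05:30[Asia/Kolkata]
    }
}

java.time quietly shifts a gap time forward, whereas Joda-Time refuses to parse it at all, which is why I expected timezone => "Asia/Kolkata" to sidestep the problem entirely.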


(Jakob Reiter) #2

Can you post an example that is failing?
I just tried and it seems to work for me:

{
    "@timestamp" => 2016-10-01T21:29:40.997Z,
          "port" => 46418,
      "@version" => "1",
          "host" => "127.0.0.1",
          "time" => "2016-10-02_02:59:40.997",
       "message" => "2016-10-02_02:59:40.997",
          "tags" => []
}

What LS and Java version are you using?


#3

Thanks for your reply jakommo, my setup is as follows:

logstash-5.0.1
Java 1.8.0_111
Current system server timezone is AEDT; daylight saving flipped at 02:00 on 02/10/2016

#logstash pipeline configuration

filter {
    grok {
        patterns_dir => ["./patterns"]
        match => { "message" => "%{TRANTRACK_DATE:time} %{NOTSPACE} %{NOTSPACE}  %{NOTSPACE} - %{NOTSPACE} Completed :\s\w*=(%{WORD:socketval})?,\w*=(%{WORD:txnref})?,\w*=(%{WORD:clientid})?,\w*=(%{NUMBER:respcode})?,\w*=(%{MULTI_WORD:resptext})?,\w*=(%{WORD:duration})?,\w*=(%{TRAN_WITH_SUBTYPE:txntype})?,\w*=(%{WORD:interface})?" }
        overwrite => ["message"]
    }
    if ([message] =~ "Transaction Start") {
        drop {}
    }
    date {
        match => ["time", "yyyy-MM-dd_HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSS"]
        timezone => "Asia/Kolkata"
    }
}

#./patterns directory

TRANTRACK_DATE %{YEAR}[/-]%{MONTHNUM}[/-]%{MONTHDAY}[\s_]%{HOUR}:%{MINUTE}:%{SECOND}[.]\d\d\d
MULTI_WORD [\b\w+|\W\b\s]+
TRAN_WITH_SUBTYPE [\b\w+\[\w+\]\b]+
NUMBER_OR_WORD (%{NUMBER}|%{WORD})


#Record that fails in Logstash but parses correctly in the grok debugger

2016-10-02_02:00:01.632 [transaction-1] INFO  companyA.engine.TransactionHandler - <TransactionTracker>Transaction Completed : SOCKETID=61915,TXNREFERENCE=1000000001974928,CLIENTID=10000046,RESPONSECODE=05,RESPONSETEXT=DO NOT HONOUR,DURATION=15,TRANSACTIONTYPE=PURCHASE[STATUS],INTERFACE=CREDITCARD



#example of the failure output
16:19:42.909 [[main]>worker7] WARN  logstash.filters.date - Failed parsing date from field {:field=>"time", :value=>"2016-10-02_02:59:59.999", :exception=>"Invalid format: \"2016-10-02_02:59:59.999\" is malformed at \"_02:59:59.999\"", :config_parsers=>"yyyy-MM-dd_HH:mm:ss.SSS,yyyy-MM-dd HH:mm:ss.SSS", :config_locale=>"default=en_AU"}
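To isolate whether this is about the pattern or the zone transition, one option is to run the exact failing value through Joda-Time outside Logstash. This is a rough sketch under the assumption that the date filter's behaviour mirrors plain Joda parsing and that a joda-time jar is on the classpath; the value, pattern and zone names are the ones from the config and warning above:

import org.joda.time.DateTimeZone;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;

// Standalone check: parse the failing timestamp with the configured pattern
// against both the intended zone and the host zone.
public class ParseCheck {
    public static void main(String[] args) {
        String value = "2016-10-02_02:59:59.999";
        String pattern = "yyyy-MM-dd_HH:mm:ss.SSS";

        for (String zone : new String[] {"Asia/Kolkata", "Australia/Sydney"}) {
            DateTimeFormatter fmt = DateTimeFormat.forPattern(pattern)
                    .withZone(DateTimeZone.forID(zone));
            try {
                System.out.println(zone + " -> " + fmt.parseDateTime(value));
            } catch (Exception e) {
                // Australia/Sydney is expected to fail here: 02:00-03:00 on
                // 2016-10-02 falls inside that zone's DST gap.
                System.out.println(zone + " -> " + e.getMessage());
            }
        }
    }
}

If Asia/Kolkata parses cleanly and only Australia/Sydney fails with an "Illegal instant" error, the configured timezone is somehow not being applied; an "Invalid format ... is malformed" error like the one above would instead point at the pattern or locale side.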

(Jakob Reiter) #4

Hmm, I tried with AEDT on a CentOS 7 machine using LS 5.0.1 and Java 8u111, but it still works for me.

# timedatectl 
      Local time: Thu 2016-12-22 03:08:14 AEDT
  Universal time: Wed 2016-12-21 16:08:14 UTC
        RTC time: n/a
       Time zone: Australia/Sydney (AEDT, +1100)
     NTP enabled: n/a
NTP synchronized: yes
 RTC in local TZ: no
      DST active: yes
 Last DST change: DST began at
                  Sun 2016-10-02 01:59:59 AEST
                  Sun 2016-10-02 03:00:00 AEDT
 Next DST change: DST ends (the clock jumps one hour backwards) at
                  Sun 2017-04-02 02:59:59 AEDT
                  Sun 2017-04-02 02:00:00 AEST

# rpm -qa | grep openjdk
java-1.8.0-openjdk-headless-1.8.0.111-2.b15.el7_3.x86_64

# logstash-5.0.1/bin/logstash -r -w1 -f ls.conf 
Sending Logstash's logs to /root/logstash-5.0.1/logs which is now configured via log4j2.properties
[2016-12-22T03:09:01,800][INFO ][logstash.inputs.tcp      ] Starting tcp input listener {:address=>"0.0.0.0:5014"}
[2016-12-22T03:09:02,309][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>125}
[2016-12-22T03:09:02,310][INFO ][logstash.pipeline        ] Pipeline main started
[2016-12-22T03:09:02,495][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
{
    "@timestamp" => 2016-10-01T21:29:40.997Z,
          "port" => 57972,
      "@version" => "1",
          "host" => "10.52.101.1",
          "time" => "2016-10-02_02:59:40.997",
       "message" => "2016-10-02_02:59:40.997",
          "tags" => []
}

I just changed the grok to
match => { "message" => "%{TRANTRACK_DATE:time}"}
to accept just the timestamp.

Just to make sure, you are having this problem when sending/reading the value "2016-10-02_02:59:40.997"?
This is the input, not the one it is converted to in UTC, right?
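If it helps, the exact failing record can be replayed into the tcp input from the startup log above (port 5014; adjust host and port to match your own input) with a small helper along these lines (just a convenience sketch, not part of the original setup):

import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Convenience helper: push the single failing log line into the tcp input
// on port 5014 so the record can be tested in isolation.
public class ReplayRecord {
    public static void main(String[] args) throws Exception {
        String line = "2016-10-02_02:00:01.632 [transaction-1] INFO  companyA.engine.TransactionHandler"
                + " - <TransactionTracker>Transaction Completed : SOCKETID=61915,"
                + "TXNREFERENCE=1000000001974928,CLIENTID=10000046,RESPONSECODE=05,"
                + "RESPONSETEXT=DO NOT HONOUR,DURATION=15,TRANSACTIONTYPE=PURCHASE[STATUS],"
                + "INTERFACE=CREDITCARD\n";

        try (Socket socket = new Socket("127.0.0.1", 5014);
             OutputStream out = socket.getOutputStream()) {
            out.write(line.getBytes(StandardCharsets.UTF_8));
        }
    }
}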


#5

Yes, this date is being fed in from the log file.


(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.