Day of week changes during the day based on timezone

I have a log from which I want to extract the day of the week ( just the number ).

It works as I expect until a certain hour of the day, which matches my timezone offset.

I live at UTC -0300, so before 9 PM the week day is OK, but after 9 PM it changes to the next week day.

As an example, April 1st 2017 was Saturday before 9 PM but after 9 PM it changed to Sunday.

My grok looks like this

...
    mutate {
        add_field => {
            "timestamp" => "%{year}-%{month}-%{day} %{time} -0300"
        }
    }

    date {
        locale => "pt_BR"
        timezone => "America/Sao_Paulo"
        match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss.SSS Z" ]
    }

    if "_dateparsefailure" in [tags] {
        drop { }
    }

    mutate {
        add_field => {"[tbwhour]" => "%{+HH}"}
        add_field => {"[tbwweekday]" => "%{+e}"}
    }

...

Each log entry starts with ( IP - - Year-Month-Day Hour:Minute:Second.Millis ). There is no timezone in the LOG.

XXX.YYY.ZZZ.WWW - - 2017-04-01 14:07:41.256

My index has this

@timestamp April 1st 2017, 20:11:27.732

tbwweekday 6

@timestamp April 1st 2017, 21:00:44.648

tbwweekday 7

See that for the same day of the month I have 2 different week days.

Any help ?

Thanks in advance.

All times are processed in UTC/GMT. They are stored in Elasticsearch in UTC/GMT, and Kibana translates them back into whatever local time zone you are in. I understand that this might be inconvenient, especially where you want to be able to point to an index and say, "I know exactly what's in that," but even that expectation is beginning to deviate from best practices for Elasticsearch. For example, the Rollover API approach to index management is to have your indices roll over not necessarily by date (though you can do so if desired), but by size, so as to reduce the number of shards on each node in your cluster.

Thank you for your comments.

What is still not clear to me is why, even with the timestamp saved in UTC, the same day of the year / day of the month has two different days of the week.

In a given timezone a day must be the same week day from 00:00:00 up to 23:59:59 .

What I need is to "lock" the week day for any particular day of the year as I need to graph the access by week days.

If April 1st is Saturday at 6 AM it must be Saturday at 23:59:59 as the day of the year is the same no matter how you saved it.

With the current behaviour ( in my particular case ), when the time of the day goes beyond 9 PM ( UTC -0300 ) the week day changes, but the timestamp still indicates the same day of the year / day of the month, as I showed in my example.

I ended up with two different week days for the same day of the year !

Thank you.

April 1, 2017, 20:11:27.732 + 3 hours (since you're UTC -3) is actually April 1, 2017, 23:11:27.732 UTC

and that's why

April 1, 2017, 21:00:44.648 + 3 hours is actually April 2, 2017, 00:00:44.648 UTC. It's a new day, and that's why tbwweekday increases from 6 to 7.
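The arithmetic above can be checked with a short Python sketch. It is only an illustration, not what Logstash runs internally; the helper name `weekday_in_both_zones` and the fixed `-03:00` offset are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-03:00 offset, assumed here to match the poster's local time.
LOCAL = timezone(timedelta(hours=-3))


def weekday_in_both_zones(local_dt):
    """Return (local ISO weekday, UTC ISO weekday) for an aware datetime.

    ISO weekdays run from 1 (Monday) to 7 (Sunday).
    """
    utc_dt = local_dt.astimezone(timezone.utc)
    return local_dt.isoweekday(), utc_dt.isoweekday()


# The two timestamps from the example, expressed in local time.
before = datetime(2017, 4, 1, 20, 11, 27, 732000, tzinfo=LOCAL)
after = datetime(2017, 4, 1, 21, 0, 44, 648000, tzinfo=LOCAL)

for local_dt in (before, after):
    utc_dt = local_dt.astimezone(timezone.utc)
    print(utc_dt.isoformat(), weekday_in_both_zones(local_dt))
```

For the first timestamp both weekdays are 6 (Saturday); for the second, the local weekday is still 6 while the UTC weekday is already 7 (Sunday).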

There are other ways to accomplish that. You can probably create a custom histogram with 1 day resolution in Kibana.

Thank you again for your comments.

I already have 2 graphs one for Usage by Time of the Day and another one for Usage by Day Of The Week.

The information is taken from these two fields, tbwhour and tbwweekday, as you can see from the Grok I posted.

tbwhour goes from 0 to 23 and tbwweekday goes from 1 to 7, depending on the day of the week.

Considering the April 1st example, when tbwhour is below 9 PM ( 21 for 24 hour representation ) April 1st is Saturday and when tbwhour is above 9 PM April 1st is Sunday.

The point is : if you take a look at timestamp it is still April 1st not April 2nd when hour is above 9 PM.

@timestamp April 1st 2017, 20:11:27.732

tbwweekday 6

@timestamp April 1st 2017, 21:00:44.648

tbwweekday 7

No matter whether the timestamp is saved in UTC or not, at this timestamp / timezone April 1st must be Saturday during the entire "day" from 00:00:00 to 23:59:59.

For every timezone a day must have 24 hours.

Even if April 1st changes to April 2nd at 9 PM, the timestamp should show this as well, and it doesn't.

Looks like the function that handles Hour and Week Day is not taking timezone into account.

Another possibility is that my Grok is wrong / something is missing about timezone configuration.

Regards.

I can't tell from this where you're seeing @timestamp. If it is in Kibana, then it's the UTC time adapted back to your local timezone, so it is exactly what I said it is: the stored value is indeed April 2nd at 00:00:44.648 UTC. If you are seeing that @timestamp value in Logstash, in stdout output with the rubydebug codec, then I would expect to see the timestamp in ISO8601 format. I used most of your configuration to demonstrate this:

input { stdin {} }

filter {
    date {
        match => [ "message" , "yyyy-MM-dd HH:mm:ss.SSS Z" ]
    }

    mutate {
        add_field => {"[tbwhour]" => "%{+HH}"}
        add_field => {"[tbwweekday]" => "%{+e}"}
    }
}

output { stdout { codec => rubydebug } }

According to the date filter you have configured, your timestamp fields look exactly like 2017-04-01 20:11:27.732 -0300 and 2017-04-01 21:00:44.648 -0300, but look what comes out of Logstash when I feed that in:

2017-04-01 20:11:27.732 -0300
{
    "@timestamp" => 2017-04-01T23:11:27.732Z,
       "tbwhour" => "23",
      "@version" => "1",
          "host" => "logstash",
    "tbwweekday" => "6",
       "message" => "2017-04-01 20:11:27.732 -0300"
}
2017-04-01 21:00:44.648 -0300
{
    "@timestamp" => 2017-04-02T00:00:44.648Z,
       "tbwhour" => "00",
      "@version" => "1",
          "host" => "logstash",
    "tbwweekday" => "7",
       "message" => "2017-04-01 21:00:44.648 -0300"
}

This is cut-and-pasted output illustrating what I was trying to explain earlier. The -0300 time zone means that 21:00 in your time zone is actually 00:00 in UTC, which is what Logstash is reckoning in (as is Elasticsearch). Kibana just translates it back to a local timestamp for your viewing.
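To see why the switch happens exactly at 21:00 local time, the following Python sketch walks through every hour of April 1st at UTC-03:00 and lists the hours whose UTC weekday differs from the local one. The function name `hours_with_shifted_weekday` and the fixed offset are assumptions for illustration only; this is not Logstash code.

```python
from datetime import datetime, timedelta, timezone


def hours_with_shifted_weekday(year, month, day, offset_hours):
    """List the local hours whose UTC weekday differs from the local weekday."""
    local_tz = timezone(timedelta(hours=offset_hours))
    shifted = []
    for hour in range(24):
        local_dt = datetime(year, month, day, hour, tzinfo=local_tz)
        utc_dt = local_dt.astimezone(timezone.utc)
        if utc_dt.isoweekday() != local_dt.isoweekday():
            shifted.append(hour)
    return shifted


# Local hours of April 1st 2017 (UTC-03:00) that already count as Sunday in UTC.
print(hours_with_shifted_weekday(2017, 4, 1, -3))
```

With a -03:00 offset the list contains the last three hours of the local day, which matches the "after 9 PM" behaviour described in this thread.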

Thank you again.

I understand what you mean and I did the same tests you did and results are the same !

Just to let you know that the examples are from Kibana fields : timestamp and tbwweekday

What is driving me crazy is that this LOG file has only April 1st data ( the server creates a LOG for each day ) and I want to extract week day from it.

When I graph the info, part of it is Saturday and part of it is Sunday ( considering what I have in tbwweekday field ) !

The same happens with all the other days in these LOGs. I mean, the next LOG ( April 2nd ) starts on Sunday and ends on Monday, and so on.

What I need is to keep the same day of the week ( in the tbwweekday field ) for the entire day, as the LOG content is related to just one single day !

Regards,

Hi,

Just to let you know that I've found a workaround ( locking the time of the day ) like this

    mutate {
        add_field => {
            "loc_timestamp" => "%{year}-%{month}-%{day} 12:01:02.003 -0300"
        }
    }

    date {
        locale => "pt_BR"
        timezone => "America/Sao_Paulo"
        match => [ "loc_timestamp" , "yyyy-MM-dd HH:mm:ss.SSS Z" ]
    }

    if "_dateparsefailure" in [tags] {
        drop { }
    }

    mutate {
        add_field => {"[tbwweekday]" => "%{+e}"}
    }

    mutate {
        remove_field => [ "loc_timestamp" ]
    }

So now, April 1st 2017 will always be Saturday from 00:00:00 to 23:59:59 in my example.
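The idea behind this workaround can be sketched in Python: replacing the time of day with a fixed midday value keeps the UTC instant on the same calendar date, so the weekday derived from it no longer flips. This is only a sketch under assumptions; `pinned_weekday` is a hypothetical helper, and the `-03:00` offset and the 12:01:02.003 value are taken from the configuration above.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-03:00 offset, matching the "-0300" suffix in loc_timestamp.
LOCAL = timezone(timedelta(hours=-3))


def pinned_weekday(local_dt):
    """Return the UTC ISO weekday after pinning the local time to 12:01:02.003."""
    pinned = local_dt.replace(hour=12, minute=1, second=2, microsecond=3000)
    return pinned.astimezone(timezone.utc).isoweekday()


# Every event on April 1st 2017 now maps to the same weekday,
# whatever its original hour was.
for hour in (0, 20, 21, 23):
    event = datetime(2017, 4, 1, hour, 30, tzinfo=LOCAL)
    print(hour, pinned_weekday(event))
```

Pinning to midday is safe for an offset like -03:00, but an offset far from UTC could still push the pinned time across midnight. One thing to watch: as far as I know, a date filter without a `target` option writes its result to @timestamp, so the order of the two date filters matters if the real event time should be kept.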

I'm not saying that the Kibana timestamp / timezone handling has any problem, as it works as expected ( it converts to the browser timezone ). But the conversion routine for Day Of the Week changes the week day for the same day of the year depending on the hour of the day.

As I need to graph by Day Of The Week and I know that every server LOG is related to just one specific day of the week this workaround did the trick.

I can't tell you if this is the best practice / code but it's working so far.

I really appreciate all your comments and your help.

Thank you.
