Logstash date filtering and event @timestamp

I'm pulling data in from an ERD database: in short, ticket records that can be updated. I'm using sql_last_date on an "updated_date" field to check for new and updated data, and I'm assigning the table's primary key as the document ID so that each ticket is uniquely identified and duplicates are avoided.
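
For reference, this is roughly what my input and output look like. Treat it as a sketch rather than my exact config: connection details, table and field names are simplified placeholders, the statement tracks the updated_date column via the plugin's :sql_last_value, and the elasticsearch output uses the ticket's primary key as the document ID.

input {
        jdbc {
                jdbc_connection_string => "jdbc:oracle:thin:@//dbhost:1521/service"
                jdbc_user => "user"
                jdbc_password => "password"
                jdbc_driver_library => "/path/to/ojdbc7.jar"
                jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
                schedule => "* * * * *"
                # Track the last seen updated_date so only new/changed tickets are fetched
                use_column_value => true
                tracking_column => "updated_date"
                tracking_column_type => "timestamp"
                statement => "SELECT * FROM tickets WHERE updated_date > :sql_last_value"
        }
}
output {
        elasticsearch {
                hosts => ["localhost:9200"]
                index => "tickets"
                # Reuse the primary key so an updated ticket overwrites the old document
                document_id => "%{id}"
        }
}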

However, when testing I notice that by default @timestamp is set to whenever the data is imported or updated, resulting in duplicate entries. For example I'll have 2 events with ID=123, one with @timestamp 1-Jun and another with @timestamp 2-Jun.

So I'm trying to set @timestamp from a field in my database called "created_date", which won't change. This field is a timestamp data type in an Oracle database. However, I keep getting parsing errors when I try to assign it to @timestamp.

I've tried converting the field to a particular format in the query itself (I tried ISO 8601 first, but couldn't find clear information on how to produce that format from an Oracle query), so I'm using this basic format:

Part of Oracle SQL SELECT Statement:
TO_CHAR(CREATE_DATE, 'DD-MON-YYYY HH24:MI:SS') AS TIMESTAMP_TEMP

Example of result of this format:
25-APR-2017 10:04:02

This works fine when I test, in the sense that the timestamp_temp field comes through in the expected format (although I still get a parse error at that point, so Kibana just treats it as a string rather than a date).

Anyway, after reading through a lot of posts I tried getting back to basics with the following test, but it still comes back with results I can't explain. I'm running this from the command line:

echo '25-APR-2017 10:04:02' | bin/logstash --path.settings /etc/logstash/ -e 'input { stdin { } } output { stdout { codec => rubydebug } } filter { date { match => [ "message", "DD-MMM-YYYY HH:mm:ss" ] } }'

Resulting in the following:

root@ccbu-reporting:/usr/share/logstash# echo '25-APR-2017 10:04:02' | bin/logstash --path.settings /etc/logstash/ -e 'input { stdin { } } output { stdout { codec => rubydebug } } filter { date { match => [ "message", "DD-MMM-YYYY HH:mm:ss" ] } }'
Sending Logstash's logs to /var/log/logstash which is now configured via log4j2.properties
{
    "@timestamp" => 2017-01-24T23:04:02.000Z,
      "@version" => "1",
          "host" => "ccbu-reporting",
       "message" => "25-APR-2017 10:04:02"
}
{"@timestamp":"2017-01-24T23:04:02.000Z","@version":"1","host":"ccbu-reporting","message":"25-APR-2017 10:04:02"}

I don't understand why it's turning 25-Apr-2017 into 24-Jan-2017. My system time is accurate, currently 7-Jun-2017, in case that is relevant. I've tried to keep things simple and only provide the relevant information, but I'm happy to share further details as needed. Thanks in advance for any suggestions.

Use "dd" instead of "DD". The latter means "day of year", not "day of month".

Thank you! How silly of me to miss something that simple; thanks for pointing it out. This was definitely the issue with the test case.

My test case is now working. I'm making the changes in my main configuration now and still running into some issues, but let me investigate further and report back tomorrow on whether I can fix my original problem.

Thank you for your fast reply on this.

Thanks again for the help here. I just wanted to report back in case someone else runs into a similar issue and finds this helpful.

While my original question was a simple formatting issue, I ran into another problem: using the day-month-year format caused Elasticsearch to map the date fields as "text", which resulted in the following error from Kibana: "Visualize: Fielddata is disabled on text fields by default."

So instead I changed my Oracle queries to output ISO 8601, although since the times are already in UTC I didn't include time zone information. There may be a better way to do this, but this is how I ended up doing it:

Select statement for Oracle database (irrelevant fields omitted for the purpose of this issue):

TO_CHAR(table.create_date, 'YYYY-MM-DD') || 'T' || TO_CHAR(table.create_date, 'HH24:MI:SS') AS create_date
FROM table

And then in my LogStash configuration I had the following date filter to match the formatting:

filter {
        date {
                locale => "EN"
                timezone => "UTC"
                match => [ "create_date", "yyyy-MM-dd'T'HH:mm:ss" ]
                target => "@timestamp"
        }
}

If you need fractional seconds (I didn't), the number of digits has to match exactly, otherwise you'll get a parse error. I experimented with this initially but dropped it because seconds are granular enough for my purposes. If you do need them, for example from an Oracle TIMESTAMP column which defaults to a precision of 6 digits, you can use the following formats:

In your query:

"YYYY-MM-DD HH24:MI:SSxFF6"

In Logstash (note: Logstash only keeps millisecond precision, i.e. 3 digits, but the pattern still needs to match exactly the same number of digits or the parse will fail):

"yyyy-MM-dd'T'HH:mm:ss.SSSSSS"
