Hello
I just want to figure it out how does JDBC input plugin writes the state value to the last_run_metadata_path
file.
Let me explain:
- We have a SQL Server database and we are ingesting a table that now has a "last_update" column of DATETIME type
- This type does not store TZ informantion
- We are in UTC -03:00
- We have a production and a test environment, both with its own Logstash VM and SQL Server database
- We've tried to keep them both as equal as possible, bute there might have differences in the database configuration, or maybe something on the Logstash VM side (but not the pipeline itself)
- I've wanted to use
jdbc_default_timezone => "America/Sao_Paulo"
, but couldn't because of this:
https://github.com/logstash-plugins/logstash-input-jdbc/issues/121
TZInfo::AmbiguousTime exception when parsing JDBC column
Here is my pipeline configuration:
input {
jdbc {
...
#jdbc_default_timezone => "America/Sao_Paulo"
use_column_value => true
tracking_column => "last_updated" # last_updated_date
tracking_column_type => "timestamp"
last_run_metadata_path => "/etc/logstash/.last_updated"
statement => "
SELECT ...
FROM my_table
WHERE
-- Environment: Test
-- Converts 'last_updated' to UTC so it is comparable to sql_last_value
-- (which in the Test environment is kept in UTC by Logstash)
--dateadd(hh, datediff(hh, getdate(), getutcdate()), last_updated) > :sql_last_value
-- Environment: Production
-- Somehow in this environment Logstash keeps sql_last_value in local time (GMT -03:00)
-- so we should not convert 'last_updated' to UTC
last_updated > :sql_last_value
ORDER BY last_updated asc"
As you can see in the comments, I had an important unexpected behavior between both environments.
Since we are 3 hours behind UTC, when I first ran the pipeline version in the Production environment with the dateadd()
conversion Logstash started ingesting the whole table, because every local time (table column) converted to UTC (add 3 hours) is greater than local time.
Here's how the state is being kept in the Test environemnt:
--- 2020-08-17 13:23:00.000000000 Z
And here's how the state is being kept in the Production environemnt:
--- 2020-08-17 10:23:00.000000000 -03:00
I've tried to take a look at the plugin's code and have found this:
The parameter jdbc_default_timezone
is not set in both environments.
So here is my question:
What made the plugin behave different in both environments?
Thank you for your help.