I'm investigating data ingest using the JDBC input plugin to pull from a SQL database. Is there a way to configure the query to pull records based on time? The table I'm pulling from has an updated_on column.
For example, I'd like to pull all records updated since the last time the query ran, with the rufus scheduler configured to run every 60s.
Yes, the input persists state. If you search this forum for sql_last_value you will find many examples.
Thanks Badger, I have it configured properly.
In case anyone else stumbles across this: on the initial run, sql_last_value is populated with the epoch start time. On every subsequent run, it is populated with the time the query last ran. The Logstash logs record the actual query each time it runs, in the following format:
[2021-03-03T12:53:00,806][INFO ][logstash.inputs.jdbc ] (0.133250s) USE DBThatB SELECT * FROM TableCarolBaskins WHERE lastUpdateDate > '1970-01-01T00:00:00.000'
[2021-03-03T12:54:00,171][INFO ][logstash.inputs.jdbc ] (0.005637s) USE DBThatB SELECT * FROM TableCarolBaskins WHERE lastUpdateDate > '2021-03-03T18:53:00.634'
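For anyone looking for a starting point, here is a minimal sketch of a pipeline input that would produce queries like the ones logged above. The driver path, driver class, connection string, and credentials are placeholders and assumptions, not the configuration actually used in this thread; only the table and column names are taken from the log lines.

```conf
input {
  jdbc {
    # Placeholder connection settings (assumed); substitute your own driver and database.
    jdbc_driver_library => "/path/to/jdbc-driver.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://dbhost:1433;databaseName=DBThatB"
    jdbc_user => "logstash"
    jdbc_password => "changeme"

    # Cron-style schedule: run the statement once a minute.
    schedule => "* * * * *"

    # :sql_last_value is substituted by the plugin on every run.
    statement => "SELECT * FROM TableCarolBaskins WHERE lastUpdateDate > :sql_last_value"
  }
}
```

By default the tracked value is the time of the previous run. If you would rather track the highest value seen in the column itself, the plugin also has use_column_value, tracking_column, and tracking_column_type options. Note too that the second value in the log above is six hours ahead of the log timestamp, which is consistent with the value being tracked in UTC rather than local time.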
Also worth noting: if you remove the JDBC input from the pipeline and put it back in, it restarts from the epoch date.
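The saved value lives in a small state file, so if you need it to survive pipeline changes it may be worth pinning that file to a known location. The fragment below is a sketch only: the path is a hypothetical example, and whether it avoids the reset described above has not been verified here.

```conf
input {
  jdbc {
    # ... connection settings, schedule, and statement as usual ...

    # Hypothetical path: the file where the plugin stores sql_last_value between runs.
    last_run_metadata_path => "/var/lib/logstash/jdbc_last_run_tablecarolbaskins"

    # Defaults, shown explicitly: keep the saved state and write it after each run.
    clean_run => false
    record_last_run => true
  }
}
```

Setting clean_run => true does the opposite: it discards the saved value, so the next run starts from the epoch again.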