Hi,
I have some questions regarding the internals of logstash when using the jdbc input plugin.
I would like to mention that my current logstash configuration is working properly, but I want to know how the jdbc_page_size and the :sql_last_value work together...
Say, for instance, that in my logstash pipeline I have the following configuration:
statement => select * from table t where t.id > :sql_last_value order by t.id
jdbc_paging_enabled => true
jdbc_page_size => 10000
use_column_value => true
tracking_column => "id"
In the documentation it is stated that jdbc_paging_enabled causes the sql statement to be broken in multiple queries...
Say I have 100.000 records in the DB, how would logstash execute the statements?
1st:
select * from table t where t.id > 0 order by t.id FETCH NEXT 10000 ROWS ONLY;
(set the last id to :sql_last_value. eg. 10000)
2nd:
select * from table t where t.id > 10000 order by t.id FETCH NEXT 10000 ROWS ONLY;
(set the last id to :sql_last_value. eg. 20000)
3nd:
select * from table t where t.id > 20000 order by t.id FETCH NEXT 10000 ROWS ONLY;
(set the last id to :sql_last_value. eg. 30000)
etc
Is this accurate?
In the source code (https://github.com/chaodhib/logstash-input-jdbc/blob/4e8d1d7fbfd01fec5840530222f982b306e8bb22/lib/logstash/plugin_mixins/jdbc.rb) it is written: "This will cause a sql statement to be broken up into multiple queries. Each query will use limits and offsets to collectively retrieve the full result-set."
In Oracle there is no limit or offset so how does it work? It uses FETCH like I mentioned before or ROWNUM?
when is the value of :sql_last_value written to the metadata file?