I have a Logstash configuration file that pulls data from an Oracle database. The document_id setting works and there are no more duplicates, but every run does a full table scan and takes forever to pull the data, because the table has millions of rows. I understand the first run taking a long time, but shouldn't subsequent runs finish quickly, since they only pull new records?
Here is the config file:
input {
  jdbc {
    jdbc_validate_connection => true
    jdbc_connection_string => "jdbc:oracle:thin:@/DBID"
    jdbc_user => "USER"
    jdbc_password => "Password"
    jdbc_driver_library => "/usr/lib/oracle/12.2/client64/lib/ojdbc8.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    # :sql_last_value is replaced with the job value recorded after the previous run
    statement => "select JOB, STATUS from JOBS where extract(year from SUBMITTED) > 2015 and JOB > :sql_last_value"
    last_run_metadata_path => "/tmp/logstash.lastrun"
    # track the JOB column instead of the run timestamp
    use_column_value => true
    tracking_column => "job"
    record_last_run => true
    schedule => "20 */5 * * *"
  }
}

filter {
}

output {
  elasticsearch {
    hosts => ["elktst01:9200"]
    index => "jobs_all_test1"
    document_id => "%{job}"
  }
  stdout { codec => rubydebug }
}
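For clarity, this is roughly the statement I expect the plugin to issue on the second run, once it substitutes the value saved in /tmp/logstash.lastrun for :sql_last_value (the literal JOB value below is made up just to illustrate):

-- sketch of the effective incremental query; 64000000 stands in for
-- whatever job value the plugin stored after the previous run
select JOB, STATUS
from JOBS
where extract(year from SUBMITTED) > 2015
  and JOB > 64000000;

Since only a few hundred rows satisfy the JOB > :sql_last_value condition, I expected that query to come back quickly.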
The first run pulls 64 million records and completes in a few hours. But when it starts up the second time it also keeps running for hours, even though it only adds the few hundred records I expect. The only problem is that every run takes that long.
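If it helps, this is a sketch of how the execution plan for that incremental query can be checked directly in Oracle (assuming SQL access to the same schema; the literal JOB value again stands in for :sql_last_value):

-- show the plan Oracle chooses for the incremental query
explain plan for
  select JOB, STATUS
  from JOBS
  where extract(year from SUBMITTED) > 2015
    and JOB > 64000000;

select * from table(dbms_xplan.display);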