JDBC and consistency of data

I am doing a simple test.
I connected to an Oracle DB using JDBC and pulled two fields (name, date); a total of 3000 records got pulled into Logstash, which matches what I see when I check the Oracle DB. It ran automatically a few times, since my config schedules it every 10 minutes, and the record count stayed the same (3000).

Now I deleted one record from that table in Oracle, but Logstash didn't remove that record from its data. How do I achieve this: when information changes in the Oracle DB, it should pull that change on the next scheduled run (which is every 10 minutes)?

But if I restart logstash.service, the new number is reflected on my dashboard and I get 2999.

Here is my config file:

input {
  jdbc {
    jdbc_validate_connection => true
    jdbc_connection_string => "jdbc:oracle:thin:@oradev01:1521/CLIENT"
    jdbc_user => "user"
    jdbc_password => "user"
    jdbc_driver_library => "/usr/lib/oracle/12.2/client64/lib/ojdbc8.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    statement => "SELECT name, entered FROM CLIENTS WHERE entered > :sql_last_value"
    last_run_metadata_path => "/tmp/logstash-Clients.lastrun"
    use_column_value => true
    tracking_column => "entered"
    tracking_column_type => "timestamp"
    clean_run => true
    record_last_run => true
    schedule => "*/10 * * * *"
  }
}

output {
  elasticsearch {
    index => "clients-name"
  }
}

What are you attempting to accomplish? The JDBC input is working as configured:

  • The first time you run it, 3000 rows are emitted from the database, creating 3000 events in Logstash that are persisted in Elasticsearch as 3000 docs. The highest value seen for your tracking_column is saved to a file and is used in future requests as :sql_last_value (see the example queries after this list).
  • The second time the schedule runs the job, the JDBC input uses your configured tracking_column and statement to select only rows where entered > :sql_last_value, which causes the DB to emit zero rows.
  • After you have deleted a row, the scheduler runs again, again selecting only rows where entered > :sql_last_value; the deleted row has no newer timestamp than anything already seen, so the DB still emits zero rows.
  • When you restart Logstash, the input's clean_run => true directive causes it to ignore the cached tracking_column value, starting over from the beginning. Since there are 2999 rows in the database at this point, all 2999 are emitted.
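
To make that concrete, here is a sketch of the queries the input effectively sends to Oracle. The timestamp literals are hypothetical stand-ins for the cached tracking value; the exact literal the plugin generates may be quoted differently:

-- First run: nothing has been cached yet, so :sql_last_value defaults to
-- the epoch and every row matches.
SELECT name, entered FROM CLIENTS WHERE entered > TIMESTAMP '1970-01-01 00:00:00'

-- Every later run: :sql_last_value is the highest "entered" value seen so
-- far (hypothetical value below). A deletion never produces a row with a
-- newer timestamp, so the query returns zero rows.
SELECT name, entered FROM CLIENTS WHERE entered > TIMESTAMP '2019-05-01 09:30:00'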

Oh, so if I do not have clean_run => true it will not start over and I will still have 3000 rows. But in fact I need 2999 as the new row count, because a row has been deleted. How do I achieve that?

Basically I want Logstash to reflect what is in the database: if something is deleted from the DB, delete it from Logstash's records; if something is added to the DB, add it to the count.

Logstash has no way of tracking deleted rows.
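
If you can change the schema, a common workaround is to stop hard-deleting rows and instead flag them, letting Logstash translate the flag into an Elasticsearch delete. The sketch below is only an outline under that assumption: the deleted and last_modified columns are hypothetical additions to your table, and name is assumed to be unique (use your real primary key for document_id if it is not):

input {
  jdbc {
    # ... connection settings as in your config ...
    # Track a modification timestamp rather than the insert timestamp, so
    # flagging a row as deleted makes it reappear in the next run.
    statement => "SELECT name, entered, deleted, last_modified FROM CLIENTS WHERE last_modified > :sql_last_value"
    use_column_value => true
    tracking_column => "last_modified"
    tracking_column_type => "timestamp"
    schedule => "*/10 * * * *"
  }
}

filter {
  # Route flagged rows to a delete action; everything else is (re)indexed.
  if [deleted] == 1 {
    mutate { add_field => { "[@metadata][action]" => "delete" } }
  } else {
    mutate { add_field => { "[@metadata][action]" => "index" } }
  }
}

output {
  elasticsearch {
    index => "clients-name"
    # A stable per-row id makes repeated runs update the same document
    # instead of creating duplicates, and lets the delete action find it.
    document_id => "%{name}"
    action => "%{[@metadata][action]}"
  }
}

With something like this you would no longer need clean_run => true or a restart: each scheduled run picks up inserts, updates, and soft deletes made since the previous run.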

I tested clean_run => false and it is what I needed at this time. Thank you.
