Hi all,
I am working with Logstash 7.13.0 and Elasticsearch 7.13.0 to populate and maintain an index with records from an Oracle DB. The index is kept current by a Logstash update service that runs every minute. A DB record is updated or inserted in the index when its timestamp is >= the timestamp that Logstash last saved in the file given by last_run_metadata_path.
The service runs without errors, so now I'd like to make sure that no records are lost if, for example, the server goes down while the service is running.
I see two methods to accomplish this:
- Persistent Queues, which are unfortunately inefficient due to disk writes.
- When the server is back online, restart my Logstash service after resetting the timestamp in last_run_metadata_path to a time before the server outage. Then all records changed during or after the outage will be updated or created in the index.
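A minimal Python sketch of the second option, to show what I mean by resetting the timestamp. It assumes the metadata file holds a single YAML document with a UTC timestamp, for example `--- 2021-06-15 10:00:00.000000000 Z`; that layout is an assumption, so check the contents of your own file first and copy its format. The function names and the 5-minute safety margin are hypothetical, and the file should only be rewritten while Logstash is stopped.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def rewound_value(outage_start: datetime,
                  margin: timedelta = timedelta(minutes=5)) -> str:
    """Return file contents with a UTC timestamp shortly before the outage.

    outage_start must be timezone-aware. The output layout is an assumption
    about the metadata file format; adjust it to match your own file.
    """
    target = outage_start.astimezone(timezone.utc) - margin
    # %f yields microseconds (6 digits); pad with zeros to 9 fractional digits.
    return target.strftime("--- %Y-%m-%d %H:%M:%S.%f000 Z\n")


def rewind_metadata(path: str, outage_start: datetime) -> None:
    """Overwrite the metadata file. Run this only while Logstash is stopped."""
    Path(path).write_text(rewound_value(outage_start), encoding="utf-8")


if __name__ == "__main__":
    # Example: the outage began at 10:00 UTC on 2021-06-15.
    start = datetime(2021, 6, 15, 10, 0, tzinfo=timezone.utc)
    print(rewound_value(start), end="")
```

Rewinding means some records are read a second time. That is harmless only if the Elasticsearch output writes each record under a stable document ID, so that a re-read overwrites the existing document instead of creating a duplicate.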
I see option 2 as the more efficient choice, since server issues should be rare. According to the Logstash docs: "Upon query execution, this file (last_run_metadata_path) will be updated with the current value of sql_last_value." My question is: when exactly is this value updated? Is it after the records have been stored in the in-memory queue? Or only after all records in the in-memory queue have been successfully inserted in the index?
For reference, the input section of my logstash configuration:
input {
  jdbc {
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_driver_library => "path\to\ojdbc8.jar"
    jdbc_connection_string => "jdbc:oracle:thin:@<DB_IP>:<DB_PORT>:<DB_SERVICE>"
    jdbc_user => "<USER>"
    jdbc_password => "<PASS>"
    statement_filepath => "path\to\update.sql"
    schedule => "* * * * *"
    last_run_metadata_path => "path\to\.logstash_jdbc_update"
  }
}
Thank you for your assistance!
Chris