Logstash - Issue with sql_last_value and last_run_metadata_path

Hello,

I'm new to ELK and I'm currently struggling with my setup - maybe I missed something.
I have set up Logstash to parse my DB (MySQL), and I have 2 cases:

  1. "Unforeseen maintenance"
    -> In this case, I need to stop/start Logstash manually without purging my DB, so when it restarts it has to pick up the records that have not been ingested yet.

  2. "Planned maintenance"
    -> In this case, I run a TRUNCATE on my table, so my ids start again from 0. I thought that by editing the file set in "last_run_metadata_path", Logstash would start using the new value I give it. However, it looks like it keeps the old id in memory, so my new entries are lost and never ingested (see the note on that file's content just below).
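
For reference, assuming an integer tracking column like id (the value below is made up), the file pointed to by last_run_metadata_path is a one-line YAML document that the jdbc input rewrites after every scheduled run:

--- 123456

My guess is that the plugin keeps this value in memory while the pipeline is running, so a change made to the file by hand is simply overwritten on the next run.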

Here is my setup:

input {
  jdbc {
    codec => plain { charset=>"UTF-8" }
    clean_run => false
    jdbc_driver_library => "C:\com.mysql.jdbc_5.1.5.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/talend_logs?characterEncoding=utf8"
    jdbc_user => "root"
    jdbc_password => "************"
    schedule => "* * * * *"
    statement => "SELECT id, pid, ...... FROM flowlogs WHERE id > :sql_last_value order by id" 
    use_column_value => true
    tracking_column => "id"
    last_run_metadata_path => "C:\logstash-8.6.1\config\lastValueOfMySqlForLogs.yml"
  }
}

output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    index => "mylogs"
    user => "elastic"
    password => "*********"
    cacert => "C:\elasticsearch-8.6.1\config\certs\http_ca.crt"
  }
  stdout {}
}

Could you please let me know if I misunderstood something in the behaviour of Logstash here? Is it even possible to achieve what I'm trying to do with ids? (I couldn't work with dates, unfortunately.)

Thank you very much.

Hello,

Never mind, I think I solved my issue by modifying the setup.
I eventually stopped relying on the id and switched to timestamps instead.

I'm using a date column from my DB (moment), which holds the time at which the record is inserted into the DB. Then I let Logstash set sql_last_value to the last query execution date.
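
For context, moment is just a server-side insertion timestamp. A minimal sketch of how such a column could be defined (illustrative only, my real table definition differs) would be:

-- illustrative only: a timestamp column defaulting to the insertion time
ALTER TABLE flowlogs ADD COLUMN moment TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP;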

I just had to handle the timezones in this configuration.
So far it looks like it solved both my issues, so I'll close this topic.

Here is the new setup, in case it helps anyone.

input {
  jdbc {
    codec => plain { charset=>"UTF-8" }
    clean_run => false
    jdbc_driver_library => "C:\com.mysql.jdbc_5.1.5.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/talend_logs?characterEncoding=utf8"
    jdbc_user => "root"
    jdbc_password => "**********"
    jdbc_default_timezone => "Europe/Luxembourg[dst_enabled_on_overlap:true]"
    schedule => "* * * * *"
    statement => "SELECT id, pid, ... FROM flowlogs WHERE moment > :sql_last_value order by moment ASC" 
    record_last_run => true
    plugin_timezone => "local"
  }
}

output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    index => "talend-logs-etl"
    user => "elastic"
    password => "***********"
    cacert => "C:\elasticsearch-8.6.1\config\certs\http_ca.crt"
  }
  stdout {}
}
