I'm quite unclear about what sql_last_value does when I give my statement like this:
statement => "SELECT * from mytable where id > :sql_last_value"
I roughly understand the reason for using it: instead of scanning the whole DB table on every run, it only picks up the records that were newly added. Correct me if I'm wrong.
So what I'm trying to do is create the index using Logstash.
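A minimal sketch of such a pipeline, with hypothetical connection details (MySQL and the index name are only assumptions here):

    input {
      jdbc {
        # Hypothetical connection settings, for illustration only
        jdbc_driver_library => "/path/to/mysql-connector-java-5.1.40-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
        jdbc_user => "user"
        jdbc_password => "password"
        # Poll the table once a minute
        schedule => "* * * * *"
        statement => "SELECT * from mytable where id > :sql_last_value"
      }
    }
    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "mytable"
      }
    }

By default (with use_column_value left off), sql_last_value is the time the statement last ran, which only makes sense if the column you compare it against is a date or datetime.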
Simplify things by removing the elasticsearch configuration for now. Use a stdout { codec => rubydebug } output. What is the plugin doing? Crank up Logstash's log level to find out more.
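For example, a stripped-down output section along those lines, leaving the input untouched:

    output {
      # Print every event to the console so you can see what the plugin emits
      stdout { codec => rubydebug }
    }

On Logstash 5.x the log level can be raised with bin/logstash --log.level debug (older releases use the --debug flag).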
@magnusbaeck Yes, I did what you asked: I added the rubydebug codec and checked with debug mode as well.
Part of the output:
[2016-11-02T16:52:00,276][INFO ][logstash.inputs.jdbc ] (0.002000s) SELECT count(*) AS count FROM (SELECT * from TEST where id > '2016-11-02 11:21:00') AS t1 LIMIT 1
[2016-11-02T16:52:00,279][DEBUG][logstash.inputs.jdbc ] Executing JDBC query {:statement=>"SELECT * from TEST where id > :sql_last_value", :parameters=>{:sql_last_value=>2016-11-02 11:21:00 UTC}, :count=>0}
[2016-11-02T16:52:00,287][INFO ][logstash.inputs.jdbc ] (0.003000s) SELECT count(*) AS count FROM (SELECT * from TEST where id > '2016-11-02 11:21:00') AS t1 LIMIT 1
[2016-11-02T16:52:00,582][DEBUG][logstash.pipeline ] Pushing flush onto pipeline
What should I be checking in the output? I've been racking my brains over this.
Can't I use a table's id to update the index with the newly added records? I tried it with a date and a datetime field and it works perfectly fine, but I need it to work with the id.
I suspect you've run the plugin at least once before setting tracking_column, so Logstash has saved a timestamp in ~/.logstash_jdbc_last_run but never updates that file, since the resulting query never returns any results. Try deleting the file or changing its contents to the id where you want to start.
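Assuming the id column is numeric, the tracking settings would look something like this sketch (connection settings omitted; tracking_column_type only exists in newer versions of the plugin):

    input {
      jdbc {
        # ...connection settings as before...
        statement => "SELECT * from TEST where id > :sql_last_value"
        use_column_value => true   # track a column value instead of the last run time
        tracking_column => "id"
        tracking_column_type => "numeric"
      }
    }

The last-run file itself is plain YAML, so a time-based tracker holds a timestamp such as --- 2016-11-02 11:21:00.000000000 Z, while a numeric tracker holds a bare number such as --- 0.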
@magnusbaeck I deleted the .logstash_jdbc_last_run file and also tried changing its value to 0 so that it would pick up changes from the database records based on the id field. But still no use.
When I run the Logstash conf, .logstash_jdbc_last_run somehow ends up holding a timestamp value again. I can't imagine where it's picking up a timestamp from, even after I delete the file or change it to zero.
Is there any jdbc property that I'm missing above?
Make sure you delete the .logstash_jdbc_last_run file before you run your Logstash conf, and use a different last_run_metadata_path if you're running multiple Logstash instances.
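A sketch of the latter, with a hypothetical path:

    input {
      jdbc {
        # ...connection and tracking settings as before...
        last_run_metadata_path => "/var/lib/logstash/last_run_mytable"
      }
    }

Setting clean_run => true for a single run is another way to reset sql_last_value to its initial value.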
@hearvishwas, please consult the documentation for your version of Logstash (the option might not be available until a later release) and start a new thread if you have follow-up questions.
Hi @magnusbaeck,
I am facing the same issue as @Kulasangar_Gowrisang.
I am able to run Logstash with the jdbc input when I remove tracking_column_type => "numeric", but then Logstash reads the old records again, so ES is getting filled up with duplicate records. By the way, it's not tracking_column_type, it's just "type".
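One way to keep re-read rows from piling up as duplicates, whatever the tracking column, is to derive the Elasticsearch document id from the table's primary key (assuming the column is called id):

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "mytable"
        # Rows that are read again overwrite the existing document
        # instead of creating a duplicate
        document_id => "%{id}"
      }
    }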