Logstash (with input-jdbc) always syncs ALL data into Elasticsearch even when the condition is false


(Eddie) #1

Hi, everyone.
I'm having trouble synchronizing data between MySQL and Elasticsearch with Logstash (using the jdbc input). When the sync starts, I can see it keeps rolling over the same data again and again, even when the WHERE condition is false.
I did a full sync first; the problem appears when I run the incremental sync.

I also set up multi-table sync, with one configuration file per table. I should mention that this doesn't happen for all of the tables, only some of them.

Here's one of my configuration files:

comment.conf:

input {
    stdin {
    }
    jdbc {
      jdbc_connection_string => "jdbc:mysql://172.21.0.24:3306/mena"
      jdbc_user => "u"
      jdbc_password => "p"
      jdbc_driver_library => "/application/logstash-5.3.0/mysql-connector-java-5.1.40.jar"
      jdbc_driver_class => "com.mysql.jdbc.Driver"
      jdbc_paging_enabled => "true"
      jdbc_page_size => "50000"
      statement_filepath => "/application/logstash-5.3.0/mysql/comment.sql"
      last_run_metadata_path => "/application/logstash-5.3.0/conf.d/.logstash_jdbc_last_run-comment"
      schedule => "* * * * *"
      type => "comment"
    }
}

output {
    elasticsearch {
        hosts => ["172.21.0.26:9280","172.21.0.27:9280"]
        index => "mena"
        document_type => "comment"
        document_id => "%{id}"
    }
    stdout {
        codec => json_lines
    }
}

comment.sql:
select * from comment where es_timestamp >= :sql_last_value

BTW, the type of the "es_timestamp" column is:
es_timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

Did I do something wrong? If I haven't made myself clear, please just tell me.
Thanks, guys.


(Eddie) #2

Please take a look at this problem; if anything is wrong, please just tell me.
Thanks


(Guy Boertje) #3

Set the record_last_run option to true. See https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html#plugins-inputs-jdbc-record_last_run. If you do not set this, Logstash will use the UNIX epoch start time as the sql_last_value on every run.

Then monitor the contents of "/application/logstash-5.3.0/conf.d/.logstash_jdbc_last_run-comment". You should see the timestamp in that file change at one-minute intervals.

Background:
When Logstash first starts, it uses the UNIX epoch start as the first value for sql_last_value. After the first set of records is processed, Logstash should set and save sql_last_value to the wall-clock time at which that run finished. It then sleeps until the next minute and begins again, but this time it should substitute :sql_last_value in the query with something like TIMESTAMP '2017-05-15 10:11:12.123456'; you can verify the query in the DB server logs. So the first run's query should contain TIMESTAMP '1970-01-01 00:00:00.000000' and the next should contain TIMESTAMP '2017-05-15 10:11:12.123456'.
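Concretely, with the comment.sql statement shown above, the queries Logstash sends on the first and subsequent runs would look roughly like this (timestamp values are illustrative; when jdbc_paging_enabled is on, the statement is additionally wrapped with LIMIT/OFFSET paging):

-- First run: sql_last_value defaults to the UNIX epoch
select * from comment where es_timestamp >= TIMESTAMP '1970-01-01 00:00:00.000000'

-- Later runs: sql_last_value is the recorded time of the previous run
select * from comment where es_timestamp >= TIMESTAMP '2017-05-15 10:11:12.123456'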

Please report back if you do not see this behaviour.


(Eddie) #4

Sorry for the delay, and thank you so much for your reply.

I solved this last week. Indeed, it was something wrong with the timestamp.
In my case, the difference is that Logstash's default timezone is UTC, which is different from my local timezone, so ":sql_last_value" was always smaller than "es_timestamp".

I just changed the SQL file to this:

select * from channel where es_timestamp > date_add(:sql_last_value,interval 8 hour)
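For reference, instead of adjusting the offset in SQL, the jdbc input plugin also has a jdbc_default_timezone option that tells it which local timezone the database stores its timestamps in, so the plugin can convert :sql_last_value for you. A minimal sketch (the timezone string here is an assumption for a UTC+8 region; substitute your own):

input {
    jdbc {
      # ... same connection settings as in comment.conf above ...
      # Assumption: MySQL stores es_timestamp in local time (UTC+8)
      jdbc_default_timezone => "Asia/Shanghai"
    }
}

With this set, the plain "where es_timestamp >= :sql_last_value" predicate should work without the date_add() adjustment.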

Thank you!


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.