Delete Elasticsearch document with logstash-jdbc-input

Insert and update work with the config file below,
but Elasticsearch does not respond when I execute a DELETE statement on MySQL.

input {
  jdbc {
    jdbc_driver_library => "/opt/logstash-2.2.2/mysql-connector-java-5.1.38/mysql-connector-java-5.1.38-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://192.168.1.1:3306/databasename"
    jdbc_user => "username"
    jdbc_password => "password"
    schedule => "* * * * *"
    statement => "SELECT * FROM tablename
                  WHERE tablename.update_time > :sql_last_value"
    use_column_value => true
    tracking_column => "update_time"
  }
}

What does this mean?

All you have shown us is an input; if there is an output that does something in ES, it would be useful to see it.

output {
  elasticsearch {
    hosts => ["192.168.1.1:9200"]
    index => "index_tablename"
    document_id => "%{id}"
  }
}

With the option document_id => "%{id}",
the existing entries in Elasticsearch with the same ID are overwritten when I update data in MySQL,
but when I delete data from MySQL, those ES entries aren't removed.
How can I remove the old data in ES that has been deleted from MySQL?


You can't; it's only an update/insert process.
I'm not sure how you can get around this.

Thanks

I also have this problem, but I don't know how to resolve it.
Any help would be sincerely appreciated!

You can add a "status" column, and flag the entry as deleted, instead of deleting the entry.
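A minimal MySQL sketch of that idea (table and column names here are illustrative, not from the thread):

```sql
-- Add a flag column so rows are never physically removed
ALTER TABLE tablename ADD COLUMN status VARCHAR(16) NOT NULL DEFAULT 'ok';

-- "Delete" a row by flagging it instead of removing it
UPDATE tablename SET status = 'deleted' WHERE id = 10;
```

Since the jdbc input selects the status column along with the rest of the row, the flag ends up in the Elasticsearch document and queries can filter on it.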


Thank you!
I have one question about what you said above. When I use SQL like the above:
"where tablename.update_time > :sql_last_value"

The following error appeared:
Exception when executing JDBC query {:exception=>#<Sequel::DatabaseError: Java::ComMysqlJdbcExceptionsJdbc4::MySQLSyntaxErrorException: Unknown column 'cc.update_time' in 'where clause'>, :level=>:warn}
Java::ComMysqlJdbcExceptionsJdbc4::MySQLSyntaxErrorException: Unknown column 'cc.update_time' in 'where clause': SELECT count(*) AS count FROM (select
*
cc is my table name; the update_time column does not exist. Can you help me?
Thanks!

First, as you know, in MySQL we just use "delete from" to remove data we no longer use.
Second, we want to sync the delete information. You said to add a "status" column; do you mean the following:

  1. In MySQL, add a status column to identify whether the data is valid, like:
    mysql> select * from cc;
    +----+--------------------+--------+
    | id | name               | status |
    +----+--------------------+--------+
    |  1 | laoyang360         | ok     |
    |  4 | intok              | ok     |
    |  5 | jdbc_test_update08 | ok     |
    |  7 | test7              | ok     |
    |  8 | test008            | ok     |
    |  9 | test9              | ok     |
    | 10 | test10             | ok     |
    +----+--------------------+--------+
    7 rows in set (0.00 sec)

mysql> update cc set status = 'deleted' where id = 10;
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from cc;
+----+--------------------+---------+
| id | name               | status  |
+----+--------------------+---------+
|  1 | laoyang360         | ok      |
|  4 | intok              | ok      |
|  5 | jdbc_test_update08 | ok      |
|  7 | test7              | ok      |
|  8 | test008            | ok      |
|  9 | test9              | ok      |
| 10 | test10             | deleted |
+----+--------------------+---------+
7 rows in set (0.00 sec)

  2. In Elasticsearch, update the data's status. I think doing that update in Elasticsearch is very difficult.

I'd like to hear your advice.


You can add a timestamp column; whenever you modify the status, the timestamp value is updated. Using the parameter "tablename.timestamp > :sql_last_value", Logstash will only update the entries whose timestamp has changed.
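In MySQL, such a column can be made self-updating so that any UPDATE (including flipping the status flag) refreshes it automatically. A sketch, using the cc table and the modified_at column name from the reply below:

```sql
-- Auto-updating timestamp: set on insert, refreshed on every update
ALTER TABLE cc
  ADD COLUMN modified_at TIMESTAMP
  DEFAULT CURRENT_TIMESTAMP
  ON UPDATE CURRENT_TIMESTAMP;
```

With this in place, the jdbc input's :sql_last_value comparison picks up both normal edits and soft deletes without any application changes.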

Thanks. I already added a timestamp column named modified_at, like:
mysql> select * from cc;
+----+--------------------+---------+---------------------+
| id | name               | status  | modified_at         |
+----+--------------------+---------+---------------------+
| 10 | test10             | deleted | 0000-00-00 00:00:00 |
| 11 | test1111           | ok      | 2016-06-23 04:10:00 |
| 12 | test012            | ok      | 2016-06-23 04:21:56 |
+----+--------------------+---------+---------------------+

and the SQL is:
[root@5b9dbaaa148a logstash_jdbc_test]# cat jdbc.sql
select *
from cc
where cc.modified_at > :sql_last_value

and the result is:
{"id":11,"name":"test1111","status":"ok","modified_at":"2016-06-23T03:10:00.000Z","@version":"1","@timestamp":"2016-06-23T04:10:00.185Z","type":"cc_type"}
{"id":12,"name":"test012","status":"ok","modified_at":"2016-06-23T03:21:56.000Z","@version":"1","@timestamp":"2016-06-23T04:10:00.187Z","type":"cc_type"}


Hi,

I get that you need to flag/log the delete operation, and I have that done via triggers in MySQL.
But how can I instruct the logstash-jdbc-input plugin to delete the given document IDs?

Seems like a simple task for the plugin, no?
Thanks
Chris

You can add a "status" column, and flag the entry as deleted, instead of deleting the entry.
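If you do keep such a flag, one way to turn flagged rows into actual deletions is to route them to the elasticsearch output's "delete" action with a conditional. A sketch (not from this thread), reusing the hosts, index, and document_id values from the config above and assuming the status value 'deleted' from the earlier example:

```
output {
  if [status] == "deleted" {
    elasticsearch {
      hosts       => ["192.168.1.1:9200"]
      index       => "index_tablename"
      document_id => "%{id}"
      action      => "delete"   # remove the document instead of indexing it
    }
  } else {
    elasticsearch {
      hosts       => ["192.168.1.1:9200"]
      index       => "index_tablename"
      document_id => "%{id}"
    }
  }
}
```

Note that the delete action fails for IDs that were never indexed, so this works best when every row has passed through the pipeline at least once before being flagged.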