Sync MySQL to Elasticsearch

adrianolimit · September 16, 2016, 12:10pm

Hello everyone,

In order to synchronise my MySQL database with Elasticsearch I set up the following process :

I activated the binary log in MySQL in order to see the different transactions
After that, I execute Logstash which read the different transactions that are stored in a file. The file that Logstash read looks like this :

INSERT INTO `mytable` (`field1`, `field2`, `field3`) VALUES (1, 'first value', 'second value')
UPDATE `mytable` SET `field1` = 'first value modified', `field2` = 'second value modified' WHERE `mytable`.`id` = 1
DELETE FROM `mytable` WHERE `mytable`.`id` = 1

I used the grok filter to check if the transaction is "INSERT", "UPDATE" or "DELETE" and after that I try to apply another filter to get the field's names and their values.

The problem is that the structure often changes, for example I could have :

UPDATE `mytable` SET `field1` = 'first value modified', `field2` = 'second value modified' WHERE `mytable`.`id` = 1
UPDATE `mytable` SET `field1` = 'first value modified'  WHERE `mytable`.`id` = 1

Here, you can notice that in the first update there is two fields that are updated and in the second line only one.

Do you know how can I solve my issue ? Do you think it's a good idea to process like this or do you have another suggestions?

Thanks in advance for your help.

magnusbaeck · September 16, 2016, 12:44pm

Any particular reason you're not using the jdbc input plugin?

adrianolimit · September 16, 2016, 1:12pm

@magnusbaeck I don't use the jdbc input plugin because (if I well understand), it will reindex all my data. For example if I have 2 millions of lines in my table, I have to do : SELECT * FROM .... in order to see the DELETE operations for example.

magnusbaeck · September 16, 2016, 1:40pm

Is there a way to tell old data from new data? Like an auto-incrementing id field (if rows aren't updated) or a timestamp that changes when a row is updated.

adrianolimit · September 16, 2016, 9:53pm

I'm not sure that is possible to know which data are been deleted because the "id" field is not an auto-increment but a specific id like this "d44fdkd_s5sd4".

Do you have any idea how can I sync my database with my elasticsearch?

magnusbaeck · September 17, 2016, 5:38pm

Okay, you also have deletions. That clearly makes things more difficult. I guess I don't have any great suggestions then.

adrianolimit · September 18, 2016, 11:25am

Ok ,thank you @magnusbaeck for your suggestions, I will try with these options :

jdbc input plugin
trigger in the database in order to execute command to Elasticsearch for each transaction
execute transaction in the database and in Elasticsearch at the same time.

Topic		Replies	Views
Mysql to Elasticsearch Sync with Logstash & input JDBC plugin Logstash	5	1951	January 10, 2019
Updated data from jdbc(mysql) to elasticsearch by using logstash Logstash	2	1836	April 14, 2018
MySQL Syncing with Logstash to Elasticsearch Logstash	2	987	December 26, 2017
Logstash sync mysql two tables to elasticsearch doesn't work Logstash	1	410	January 16, 2020
Jdbc input plugin and sql_last_value Logstash	6	1678	September 25, 2018

Sync MySQL to Elasticsearch

Related topics