Sync Google Cloud SQL to Elasticsearch

Hi all!

I'm planning build a backend on Google App Engine, using the Go lang.
I would to use the Google Cloud SQL as my primary DB (a MySQL DB fully managed), and sync some tables to Elasticsearch to take advantages that only Elasticsearch has.

Which is the best way to do that?

PS.: The unique solution that I see is: To embed the Elasticsearch REST Client in DAO of the primary DB, so when an operation is performed on my primary DB, it would be replicated on Elasticsearch. But it's is a little tricky.

What do you mean by "sync some tables"? There are fundamental challenges.

MySQL DB has columnar data in relational tables, where rows can be removed/updated/inserted very efficiently, while Elasticsearch has JSON documents, which must be denormalized, and massive updates/deletes do not perform well.

If you just want to copy table rows in a naive way by primary keys to ES JSON doc IDs from time to time, you can use my JDBC importer https://github.com/jprante/elasticsearch-jdbc/ which basically performs SQL selects. But it depends on the application data model if that is feasible or not.

First, thank you for your reply!

I said "sync some tables" because I don't want sync tables like user logins tables. Otherwise, no-private data must be synced.

I would like to understand that:

Imagine that I have a single row in my Countries table, representing the USA.
Imagine that I have 50 rows in my States table, each representing one state of USA, with a FK linked to USA row (in Countries tables).

Using JDBC Importer, I presume that, if I sync it, all States goes to Elasticsearch like: _source: {"Name": "Florida", "Country": {"Name": "USA"} }. Correct?

What happens if I update the name of a country from USA to United States in my MySQL DB? The documents inside Elasticsearch with "Country": "USA" will be updated to "United States", or not?

Thank you!

It depends on your SQL statement for JDBC importer.

If you update tables in MySQL, it will be picked up next SQL statement execution, but only if your SQL statement allows for that. For example, you can fetch whole table each run (recommended for deletes), or you can use a timestamp since last SQL statement execution. JDBC importer can use a cron-like scheduling.

Thank you very much! I appreciate that!