Incremental sync for Postgres and MySQL connectors

andrej.peplinski · August 8, 2024, 1:19pm

Are there any plans to support incremental syncs for Postgres and MySQL database connectors?

Sean_Story · August 8, 2024, 1:24pm

Both currently support the "naive" incremental sync, where we avoid sending data to Elasticsearch that we're sure hasn't changed since it was last ingested. What makes this naive though is that we still fetch this data from the source system.
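
To make the "naive" behaviour concrete, here is a minimal sketch of the idea in Python. It is not the connector's actual code: the function names, the `id` key, and the hash-based comparison are all assumptions made for illustration. The point is only that every row is still read from the source, and the saving is in what gets sent onward.

```python
import hashlib
import json


def fingerprint(doc):
    # Hypothetical helper: a stable hash of the row's content.
    payload = json.dumps(doc, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def naive_incremental_sync(source_rows, indexed_fingerprints):
    """Return only the rows whose content differs from what was last sent.

    `source_rows` is the full result set (every row is still fetched);
    `indexed_fingerprints` maps row id -> fingerprint from the previous sync
    and is updated in place.
    """
    to_send = []
    for row in source_rows:
        fp = fingerprint(row)
        if indexed_fingerprints.get(row["id"]) != fp:
            to_send.append(row)
            indexed_fingerprints[row["id"]] = fp
    return to_send
```

Running it twice over the same rows sends everything the first time and nothing the second, but the source database does the same amount of work both times.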

We don't currently have a roadmap item to be smarter about how we fetch data from Postgres or MySQL for incremental syncs. But if you have a support relationship with Elastic, you can absolutely ask your contact to file an Enhancement Request on your behalf. If that's not an option for you, I'd be happy to put you in contact with one of our product managers, if you'd like to make a case for adding that feature.

Alternatively, our code is open, and we very much appreciate community pull requests.

andrej.peplinski · August 8, 2024, 1:50pm

Thanks for the quick response @Sean_Story. I will need to evaluate how performance-critical incremental syncs are for us and then potentially come back to you.
But I have one more follow-up question: what are the criteria for being "sure" that the data hasn't changed? Are timestamps being used (as mentioned here)? And if so, what is the expected timestamp field name?

Sean_Story · August 8, 2024, 7:23pm

For our database connectors, this is currently very unsophisticated, and we're using the table's last change date (or if it's a join, the most recent change time of any of the joined tables). So if anything has changed in the table, we're pulling all of it. See this code.
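
Roughly, the decision looks like the sketch below. This is an illustration only, not the linked code: `tables_changed_since` and its arguments are hypothetical names, and treating an unknown change time as "changed" is an assumption chosen to be on the safe side.

```python
from datetime import datetime


def tables_changed_since(table_change_times, last_sync_time):
    """Decide whether a full re-pull is needed.

    `table_change_times` maps table name -> last change time (or None if the
    database cannot report one). For a join, pass every joined table.
    """
    if last_sync_time is None:
        return True  # first sync: always pull
    if any(changed is None for changed in table_change_times.values()):
        return True  # unknown change time: assume it changed
    # The most recent change across all involved tables decides.
    return max(table_change_times.values()) > last_sync_time


last_sync = datetime(2024, 8, 1, 12, 0)
changes = {
    "orders": datetime(2024, 7, 30, 9, 0),
    "customers": datetime(2024, 8, 2, 8, 30),
}
if tables_changed_since(changes, last_sync):
    print("something changed: pull all rows")
```

Note that a single changed row in any one of the tables is enough to trigger pulling everything.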

I do imagine that if we implemented a non-naive incremental sync feature for our database connectors, we'd require you to specify a timestamp field that indicates when each row was last changed.
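
As a sketch of what that could look like (purely hypothetical, since this feature does not exist in the connectors today): the sync would remember the highest timestamp it has seen and only ask the database for rows newer than that. The `products` table, the `updated_at` column, and the function name are assumptions, and SQLite is used here only so the example runs on its own; the query shape would be similar on Postgres or MySQL.

```python
import sqlite3


def fetch_changed_rows(conn, cursor_value):
    """Fetch only rows changed after `cursor_value`; return (rows, new_cursor).

    Assumes a hypothetical `products` table whose `updated_at` column holds
    ISO-8601 strings, which compare correctly as text.
    """
    rows = conn.execute(
        "SELECT id, name, updated_at FROM products "
        "WHERE updated_at > ? ORDER BY updated_at",
        (cursor_value,),
    ).fetchall()
    # Advance the cursor to the newest timestamp seen, or keep the old one.
    new_cursor = rows[-1][2] if rows else cursor_value
    return rows, new_cursor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "a", "2024-08-01T00:00:00"), (2, "b", "2024-08-05T00:00:00")],
)
rows, cursor = fetch_changed_rows(conn, "2024-08-02T00:00:00")
```

One limitation of this approach is that a timestamp column alone cannot reveal deleted rows, so deletions would need separate handling.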

You also may be interested to read: Elastic Connectors: Performance impact of incremental syncs — Search Labs
