I am using Logstash to call an API (input), save the result as a .csv file (filter), and upload it into a MySQL table (output). That part works fine.
But I need to run it every day (or rather, every night): call the API, get the .csv, and upload it into MySQL.
For the upload I am using a simple statement:
statement => INSERT INTO ... VALUES
Run every day, this will insert the rows that are already in the table again, so they end up duplicated.
The question is: what would you advise as the best practice for uploading only the delta, that is, the rows that are new in the .csv file but do not already exist in MySQL?
I don't have a primary key on the MySQL table, so I probably need to create one from 3 columns. There is no id and no unique identifier, and the same values can appear multiple times. Example of the data from the CSV/MySQL:
ColumnA ColumnB ColumnC ColumnD ColumnE
ValueA ValueB ValueC ValueD ValueE
ValueA ValueB ValueC1 ValueD1 ValueE
ValueA ValueB ValueC ValueD1 ValueE1
ValueA ValueB1 ValueC2 ValueD2 ValueE2
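
To make the idea concrete, this is roughly what I have in mind, as an untested sketch. The table name my_table, the index name uq_row, and the choice of ColumnC, ColumnD and ColumnE as the key are placeholders (I picked them only because they are distinct in the sample above). I am also assuming that skipping a row whose key already exists is acceptable:

```sql
-- One-time setup: a composite unique key built from 3 columns.
-- my_table, uq_row and the column choice are hypothetical placeholders.
ALTER TABLE my_table
  ADD UNIQUE KEY uq_row (ColumnC, ColumnD, ColumnE);

-- Nightly load: INSERT IGNORE skips any row whose key already exists
-- instead of failing or inserting a duplicate.
INSERT IGNORE INTO my_table (ColumnA, ColumnB, ColumnC, ColumnD, ColumnE)
VALUES ('ValueA', 'ValueB', 'ValueC', 'ValueD', 'ValueE');
```

My concern with this is that if two rows with the same key are both legitimate, the second one would be dropped silently, and INSERT IGNORE also turns other insert errors into warnings. Is this the right direction, or is there a better pattern?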