Hello,
I am using the ELK stack to import data from an external SQL Server database (big data: millions of rows).
I want to delete the rows in SQL Server that have already been imported into Elasticsearch.
A prior solution was to create two pipelines:
- One imports all rows from a certain table on the SQL server.
- Another truncates all data in that table on the SQL server.
The problem is that this erases every row in the table, including rows that have not yet been imported into Elasticsearch.
My question is: is there a way to identify which rows from that table have already been imported into Elasticsearch, so that I can modify my JDBC statement to delete only those rows?
input {
  jdbc {
    jdbc_driver_library => "/data/sqljdbc_6.0/enu/jre8/sqljdbc42.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://xx.xx.xx.xx:1433;instanceName=xxx;databasename=xxx"
    jdbc_user => "Elastic"
    jdbc_password => "password"
    schedule => "* * * * *"
    statement => "DELETE FROM TABLE_X WHERE **_condition_**"
  }
}
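For reference, here is a minimal sketch of what I have in mind for the import side, using the jdbc input plugin's `use_column_value` / `tracking_column` options so that the last imported value is recorded. This assumes TABLE_X has an incrementing numeric `id` column, which may not match my actual schema:

input {
  jdbc {
    jdbc_driver_library => "/data/sqljdbc_6.0/enu/jre8/sqljdbc42.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://xx.xx.xx.xx:1433;instanceName=xxx;databasename=xxx"
    jdbc_user => "Elastic"
    jdbc_password => "password"
    schedule => "* * * * *"
    # Track the highest imported id (assumes an incrementing numeric id column)
    use_column_value => true
    tracking_column => "id"
    # :sql_last_value is the last tracked id, persisted by Logstash between runs
    statement => "SELECT * FROM TABLE_X WHERE id > :sql_last_value"
  }
}

If something like this works, the tracked value (persisted by default in the `.logstash_jdbc_last_run` file) might then bound the delete statement, e.g. `DELETE FROM TABLE_X WHERE id <= <last imported id>`, so only imported rows are removed.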
Thanks in advance.
Ganessen