Hi,
I'm uploading data with a Logstash JDBC pipeline, but every time I run the script it re-imports all of the data, so the index count includes the duplicates as well.
This is the script I am using:
input {
  jdbc {
    jdbc_driver_library => "/usr/share/java/mysql-connector-java-8.2.0.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/app_medixcel_base"
    jdbc_user => "root"
    jdbc_password => "plus91"
    use_column_value => true
    tracking_column => "leave_id"
    tracking_column_type => "numeric"
    statement => "SELECT leave_id, staff_id, start_date, end_date, end_time, leave_reason, is_full_day, approved, added_on, approval_reason, approved_by, clinic_id, leave_in_days FROM app_medixcel_base.mxcel_staff_leaves WHERE start_date BETWEEN '2023-01-27' AND '2024-03-27';"
  }
}

filter {
  mutate {
    copy => { "id" => "[@metadata][_id]" }
    remove_field => ["@version"]
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    user => "elastic"
    password => "plus91"
    index => "index_mxcel_patient_leaves"
  }
  stdout { codec => "rubydebug" }
}
My tracking column is leave_id.
The script runs successfully and gives no errors. How can I avoid uploading duplicate data?
I forgot to mention: you don't need a fingerprint, since you already have leave_id as a primary key. That said, the fingerprint filter can be used to build a unique ID, which you then set as the document_id.
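For illustration, here is a minimal sketch of both options, assuming the pipeline from the question. Option A reuses leave_id directly as the document_id; option B hashes one or more columns with the fingerprint filter and uses the hash instead. Only leave_id and the output settings come from the question; the rest (SHA256 method, [@metadata][fingerprint] target) are assumptions you can adjust.

filter {
  # Option B only: build a deterministic ID from the columns that uniquely identify a row.
  # With a single primary key like leave_id this step is unnecessary.
  fingerprint {
    source => ["leave_id"]                 # assumption: leave_id alone identifies the row
    target => "[@metadata][fingerprint]"   # kept in @metadata so it is not indexed as a field
    method => "SHA256"                     # deterministic, so re-runs produce the same ID
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    user => "elastic"
    password => "plus91"
    index => "index_mxcel_patient_leaves"

    # Option A: use the primary key column directly.
    document_id => "%{leave_id}"

    # Option B: use the fingerprint instead.
    # document_id => "%{[@metadata][fingerprint]}"
  }
}

Either way, re-running the pipeline overwrites the same documents instead of creating new ones. Also note that the mutate copy in your filter refers to a field named id, which your SELECT does not return; the column is leave_id, so either copy leave_id into [@metadata][_id] or reference it directly as above.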