Problem:
I am using ELK 6.71. I can load log data into Elasticsearch from a MySQL database with a Logstash input plugin, and I have scheduled the SQL query to run every minute. However, every time Logstash runs the query it also fetches the documents it has already fetched (duplicates).
Is there any way to fetch only the new entries each time I run the SQL query? Every document contains a unique log number field.
What does your output section look like?
Use
document_id => "%{logid}"
and you won't have duplicates.
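A minimal sketch of an output section with that setting, in case it helps. The host and index name are placeholders I made up, and it assumes each event carries the unique log number in a field called logid:

```conf
output {
  elasticsearch {
    # Placeholder host and index name; replace with your own.
    hosts => ["localhost:9200"]
    index => "mysql-logs"
    # Use the unique log number as the Elasticsearch _id, so an event
    # that is read again overwrites the existing document instead of
    # creating a second copy.
    document_id => "%{logid}"
  }
}
```

Note that the field reference has to match the field name exactly as it appears in the event.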
The problem I see, though, is that you run the full query against table_name on every scheduled run, which is not good for a large database.
It also makes Elasticsearch do extra work every time.
For example, say logid is the name of the database field and you have 10 records.
The first run reads them and Elasticsearch indexes them.
Now say that by the next run you have 10 more records in your MySQL database.
Logstash will read all 20 records.
Elasticsearch then overwrites the 10 documents from the first run and indexes all 20.
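One way to avoid re-reading the whole table is to let the JDBC input remember the highest logid it has seen and query only for rows above it. This is a sketch, not a tested config: it assumes you are using the jdbc input plugin, that logid is numeric and always increasing, and the driver path, connection string, and credentials below are placeholders:

```conf
input {
  jdbc {
    # Placeholder driver path, connection string, and credentials.
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "user"
    jdbc_password => "password"
    # Run once a minute (cron syntax).
    schedule => "* * * * *"
    # Track the last logid seen instead of the last run time.
    use_column_value => true
    tracking_column => "logid"
    tracking_column_type => "numeric"
    # :sql_last_value is replaced with the stored tracking value,
    # so each run only returns rows newer than the previous run.
    statement => "SELECT * FROM table_name WHERE logid > :sql_last_value ORDER BY logid ASC"
  }
}
```

With this, the second run in the example above would read only the 10 new records. Keeping document_id in the output is still worthwhile as a safety net if a run is ever repeated.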