I am using Logstash to retrieve data from SQL into Elasticsearch.
I run the query through Logstash, the index is created, and I am able to visualize the data in Kibana.
My concern is how to continuously stream data from SQL to Logstash, so that whenever a record is updated in SQL, that record is also updated in Elasticsearch via Logstash.
The Logstash JDBC input plugin retrieves data through queries and supports a tracking column, which you can use to fetch only new or updated records, assuming you can write a suitable query. If your data has, for example, an updated timestamp, you can use it as the tracking column and select only records whose timestamp is greater than the value recorded the last time the query ran, picking up only new or updated data.
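A minimal sketch of such a pipeline, assuming a table with an `updated_at` timestamp column (the connection string, credentials, driver path, and table name are placeholders you would replace with your own):

```
input {
  jdbc {
    # Placeholders - point these at your own database and driver
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "user"
    jdbc_password => "password"
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"

    # Run the query every minute
    schedule => "* * * * *"

    # :sql_last_value holds the tracking column's value from the previous run
    statement => "SELECT * FROM sample WHERE updated_at > :sql_last_value"
    use_column_value => true
    tracking_column => "updated_at"
    tracking_column_type => "timestamp"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "sample"
  }
}
```

On each scheduled run, the plugin substitutes the last seen `updated_at` value into `:sql_last_value`, so only rows changed since the previous run are fetched.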
Thanks for your response. It works as shown below for id > 1.
I have 3 records with id = 1, 2, 3 in the database.
But every minute, documents keep getting added to Elasticsearch cumulatively:
after the first minute: 2 hits with id = 2, 3 (total hits: 2)
after the next minute: 4 hits with id = 2, 3, 2, 3 (total hits: 4)
after the next minute: 6 hits with id = 2, 3, 2, 3, 2, 3 (total hits: 6)
So ES is filling up with duplicate records. What I need is for any updated or newly added record in the database to be reflected in Elasticsearch.
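The duplicates suggest the statement uses a fixed condition (id > 1), so every scheduled run re-fetches the same rows. One way to fix this, sketched below with placeholder table and column names, is to let the plugin track the highest id it has seen via `:sql_last_value`, so each run only fetches rows beyond that:

```
input {
  jdbc {
    # ... connection settings as before ...
    schedule => "* * * * *"

    # Fetch only rows with an id greater than the last one indexed
    statement => "SELECT * FROM sample WHERE id > :sql_last_value"
    use_column_value => true
    tracking_column => "id"
    tracking_column_type => "numeric"
  }
}
```

With this, the first run fetches ids 1-3, and subsequent runs fetch nothing until a row with a higher id is inserted, so no duplicates accumulate.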
Thanks a lot, it works fine now.
With this, we can index any new rows inserted into the database into Elasticsearch.
Can you please help me understand one more scenario:
if we update values in existing rows, how do those updates get reflected in ES as well?
Example:
Table name: Sample
Columns: (id, Name)
For id = 1 we have name = 'Ram', both in the DB and in ES.
Suppose I update name to 'Ramu' for id = 1 in the database. How do I get that change reflected in ES?
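One common approach, sketched here with placeholder names and assuming the table has (or can be given) an `updated_at` timestamp column that the database sets on every change: track on the timestamp instead of the id, and set `document_id` in the elasticsearch output so a re-fetched row overwrites the existing document rather than creating a duplicate.

```
input {
  jdbc {
    # ... connection settings as before ...
    schedule => "* * * * *"

    # Picks up both inserts and updates, since both bump updated_at
    statement => "SELECT * FROM sample WHERE updated_at > :sql_last_value"
    use_column_value => true
    tracking_column => "updated_at"
    tracking_column_type => "timestamp"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "sample"
    # Use the table's primary key as the ES document id,
    # so updated rows overwrite the existing document
    document_id => "%{id}"
  }
}
```

When the row for id = 1 changes from 'Ram' to 'Ramu', its `updated_at` moves forward, the next run fetches it, and because the ES document id is still "1", the existing document is replaced in place. Note that this approach cannot detect rows deleted from the database; handling deletes needs a separate mechanism such as soft-delete flags.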