I am using the JDBC input plugin and have 10,000 records in the database.
I set jdbc_page_size to 500, so it loops 20 times to fetch all 10,000 records.
For this, I schedule it as:
schedule => "*/2 * * * *"
With this schedule, it will fetch 500 records from the database every 2 minutes.
But once the 20 loops are completed, Logstash goes back and fetches data from the database again, so the pipeline is continuously bringing in data. How can I stop it from bringing in duplicate data once the 10,000 records are done?
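For reference, the setup described above corresponds roughly to the following jdbc input. The connection string, driver class, and statement are placeholders, not taken from the original post:

```
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"  # assumption
    jdbc_user => "user"                                           # assumption
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"               # assumption
    statement => "SELECT * FROM records"                          # assumption
    jdbc_paging_enabled => true
    jdbc_page_size => 500
    schedule => "*/2 * * * *"   # run every 2 minutes
  }
}
```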
Is the data being logged to Elasticsearch?
If yes, you could calculate an MD5 hash of your result and use it as the document ID. All the duplicates will then be recorded in Elasticsearch under the same document ID, and only the _version will increase.
This just avoids the duplicates, though.
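A minimal sketch of the hashing approach described above, using the Logstash fingerprint filter. The source field, index name, and hosts are assumptions; in practice you would hash whichever field(s) uniquely identify a record:

```
filter {
  fingerprint {
    source => "message"                  # assumption: field(s) that identify a record
    target => "[@metadata][doc_id]"      # keep the hash out of the indexed document
    method => "MD5"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]          # assumption
    index => "records"                   # assumption
    document_id => "%{[@metadata][doc_id]}"
  }
}
```

With the hash as the document ID, re-indexing the same row overwrites the existing document instead of creating a new one.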
I do not want the pipeline to run unnecessarily, keep indexing, and consume a lot of RAM and CPU by checking the document ID. Once it has loaded the 10,000 records, it should not load any data at all.
What about your document_id in the elasticsearch output settings? If you don't specify it, each time the request is launched it creates new documents.
Give it the unique ID you have in the database; it should solve your issue.
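As a sketch of this suggestion, assuming the JDBC statement selects a primary-key column named "id" (the hosts and index name are also placeholders):

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]   # assumption
    index => "records"            # assumption
    # "id" is assumed to be the primary-key column returned by the query;
    # reusing it as the document ID makes repeated runs update in place.
    document_id => "%{id}"
  }
}
```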
I have 2 mn records and I am using a persistent queue.
This is what is happening with me:
page 0: 0.5 mn
page 1: 0.5 mn (1.0 mn)
page 2: 0.5 mn (1.5 mn)
page 3: 0.5 mn (2.0 mn)
page 4: 0.5 mn (2.5 mn)
page 5: 0.5 mn (3.0 mn)
This goes on in a continuous loop; how can I avoid that? I don't have any problem using document_id, I am already aware of that solution, but how can I avoid this continuous loop?
I want it to stop once the 2 mn records are done instead of starting a second loop, so that I can schedule it for the next day.
But the schedule here works on a pagination basis.
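A sketch of the once-a-day scheduling mentioned above: the cron expression "0 0 * * *" is standard (midnight daily), while the connection settings and statement are placeholders:

```
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"  # assumption
    jdbc_user => "user"                                           # assumption
    statement => "SELECT * FROM records"                          # assumption
    jdbc_paging_enabled => true
    jdbc_page_size => 500
    # Run the whole paginated query once per day at midnight
    # instead of every 2 minutes:
    schedule => "0 0 * * *"
  }
}
```

For incremental loads, the jdbc input also supports use_column_value / tracking_column together with :sql_last_value in the statement, so that each scheduled run only fetches rows newer than the last one seen.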
I'm sorry I misunderstood your issue.
Sadly, I can't help you as I'm still learning the Elastic Stack.
I hope you will find what you're looking for.
Hi, does anyone have any idea how we can solve this?
Thanks in advance.
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.