Is it possible to run the jdbc_page_size execution in parallel?
Say, if the count is 6K and jdbc_page_size is 1K, can we configure Logstash to execute the 6 pages in parallel?
Also, if it gets 100K rows, can we cap the number of parallel executions at, say, 10?
To be clear, my ask is to do this without configuring multiple jdbc input plugins.
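For context, this is the kind of single paged input I mean; a minimal sketch with placeholder connection settings and a placeholder statement, not my actual pipeline:

```
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb"  # placeholder
    jdbc_user => "user"                                                # placeholder
    jdbc_driver_library => "/path/to/driver.jar"                       # placeholder
    jdbc_driver_class => "org.postgresql.Driver"                       # placeholder
    statement => "SELECT * FROM students ORDER BY st_id"
    jdbc_paging_enabled => true
    jdbc_page_size => 1000   # with 6K rows this produces 6 sequential page queries
  }
}
```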
No, this can't be done.
One reason for this is that we track sql_last_value, a value taken from the results. It is updated for each row read, so its eventual value will be the value of the tracking column for the last row.
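The tracking described above can be sketched like this; table and path names are placeholders, and `last_run_metadata_path` is where the single sql_last_value state is persisted between runs:

```
input {
  jdbc {
    # ... connection settings ...
    statement => "SELECT * FROM students WHERE st_id > :sql_last_value ORDER BY st_id"
    use_column_value => true
    tracking_column => "st_id"   # sql_last_value becomes the st_id of the last row read
    last_run_metadata_path => "/var/lib/logstash/.logstash_jdbc_last_run"  # placeholder path
  }
}
```

Because that single value is advanced row by row, pages cannot safely be fetched out of order or concurrently.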
The jdbc input has a few different modes of operation, each selected via a combination of settings. We don't do a good job of separating these into distinct code paths; they are quite intermingled, which makes adding a parallel execution feature very difficult without completely refactoring the code.
It is of little help right now, but we have plans to move the code to a Java plugin that will share a lot of common code with the jdbc_streaming and jdbc_static filters. No ETA on when, though.
Thanks for your comments. I tried to solve it another way, like below:
- Instead of jdbc_page_size, I used only jdbc_fetch_size.
- Instead of one jdbc input plugin, I added ten, with each SQL statement filtered on the last digit of my primary key: RIGHT(ST_ID,1)=0, RIGHT(ST_ID,1)=1, RIGHT(ST_ID,1)=2, etc.
- I maintain 10 different sql_last_value states.
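The setup above can be sketched as follows; connection details and paths are placeholders, and each input gets its own last_run_metadata_path so the 10 sql_last_value states don't collide (only two of the ten inputs shown):

```
input {
  jdbc {
    # ... jdbc_connection_string, jdbc_user, driver settings ...
    statement => "SELECT * FROM students WHERE RIGHT(ST_ID,1) = '0' AND st_id > :sql_last_value ORDER BY st_id"
    use_column_value => true
    tracking_column => "st_id"
    jdbc_fetch_size => 1000
    last_run_metadata_path => "/var/lib/logstash/last_run_part0"  # placeholder path
  }
  jdbc {
    # ... same connection settings ...
    statement => "SELECT * FROM students WHERE RIGHT(ST_ID,1) = '1' AND st_id > :sql_last_value ORDER BY st_id"
    use_column_value => true
    tracking_column => "st_id"
    jdbc_fetch_size => 1000
    last_run_metadata_path => "/var/lib/logstash/last_run_part1"  # placeholder path
  }
  # ... eight more inputs for digits 2 through 9 ...
}
```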
This approach works for me. Do you see any impact? I am new to Logstash.
Your solution is fine. I've seen many users solve it in a similar fashion.
If the PK is numeric and always auto-generated, then a modulus function would also work for any number of jdbc input plugins.
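For example, with a numeric PK each of N inputs could filter on a different remainder; a sketch with hypothetical table and column names, assuming MOD is available in the source database's SQL dialect:

```
# one of N=10 jdbc inputs; each input uses a different remainder (0 through 9)
statement => "SELECT * FROM students WHERE MOD(st_id, 10) = 0 AND st_id > :sql_last_value ORDER BY st_id"
```

Unlike RIGHT(ST_ID,1), this partitions evenly for any N, not just 10.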
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.