Hi Team,
We need to pull 70 million records from an Oracle database within a week and load them into a 3-node Elasticsearch cluster. A scheduler on the database side loads 50k to 1 million records into a temporary table in a 6-minute cycle; after every 6 minutes it truncates the temp table and loads the next 50k to 1 million records from the actual table.
The Logstash JDBC input plugin issues a SQL query (select * from temptable) against the temp table every 3 minutes, so it must read the full 50k to 1 million records within the 6-minute window; otherwise the temp table is overwritten with the next batch, which leads to data loss.
Given this setup, is there any way to track whether Logstash has actually read all 50k to 1 million records on each run?
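One idea I am considering, though I am not sure it is the recommended approach: the metrics filter can count events as they pass through the pipeline, and that running count could be compared against the row count the scheduler loaded. A minimal sketch, assuming the filter and output blocks below are added to my pipeline (they are illustrative, not my current config):

filter {
  # emit a synthetic "metric" event every 60 seconds carrying a
  # running count of all events seen by this pipeline
  metrics {
    meter => "events"
    add_tag => "metric"
    flush_interval => 60
  }
}
output {
  # print only the counter events so the count can be compared
  # with the batch size loaded by the database scheduler
  if "metric" in [tags] {
    stdout { codec => line { format => "events read so far: %{[events][count]}" } }
  }
}

Alternatively, the Logstash monitoring API (GET /_node/stats/pipelines, on port 9600 by default) reports per-pipeline events.in / events.out counters, which might be enough if I sample it before and after each run.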
How can I increase the read capacity of Logstash? Kindly suggest the best configuration to read more data at once, and please explain how the JDBC input executes the SQL query and reads the data back (see also the sketch after my configuration below).
Here is my current plugin configuration:
jdbc {
  jdbc_driver_library => "${jdbc_path}"
  jdbc_driver_class => "${oracle_driver}"
  jdbc_connection_string => "${jdbc_url}"
  jdbc_user => "${jdbc_username}"
  jdbc_password => "${jdbc_password}"
  statement => "${query}"
  schedule => "*/3 * * * *"
  jdbc_default_timezone => "UTC"
}
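From my reading of the jdbc input documentation, the plugin executes the statement over a JDBC connection and streams the result set row by row, turning each row into one event; jdbc_fetch_size controls how many rows the driver pulls per network round trip (the Oracle driver defaults to only 10), and jdbc_paging_enabled makes the plugin break the statement into multiple paged queries of jdbc_page_size rows each. Would something like the following be a reasonable starting point? The numbers are assumptions to be tuned, not values from the docs:

jdbc {
  jdbc_driver_library => "${jdbc_path}"
  jdbc_driver_class => "${oracle_driver}"
  jdbc_connection_string => "${jdbc_url}"
  jdbc_user => "${jdbc_username}"
  jdbc_password => "${jdbc_password}"
  statement => "${query}"
  schedule => "*/3 * * * *"
  jdbc_default_timezone => "UTC"
  # assumption: fetch 10k rows per driver round trip instead of Oracle's default of 10
  jdbc_fetch_size => 10000
  # assumption: page through the result set in 100k-row queries; the exact
  # paged SQL is generated by the plugin and depends on the Oracle version
  jdbc_paging_enabled => true
  jdbc_page_size => 100000
}

I also suspect that pipeline.workers and pipeline.batch.size in logstash.yml, plus the Elasticsearch output's bulk settings, matter as much as the read side, since indexing into the 3-node cluster may be the real bottleneck.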
Regards,
Sreenivas A