Logstash performance and jdbc_streaming

Roberto_B · October 29, 2021, 2:00pm

Hi all,

I've been using Logstash these days (I love Logstash; if I could, I would use it to make coffee too) as an ETL tool: it gets the data from MySQL, transforms it by enriching it with multiple jdbc_streaming filters against the same database (in some cases I also manipulate the data), and puts it somewhere else.

I have a single Logstash instance with 4 cores and 16 GB RAM; I use 8 GB for both Xms and Xmx. To move 18 million rows (pagination 500,000), Logstash takes 24 hours. I run the config with 4 workers and a batch.size of 1000.

In another use case I moved 27 million rows (pagination 100,000) from another table with similar filters, using the same options (workers and batch.size) and the same instance. In this case Logstash took 12 hours to complete the move.
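To put the two runs side by side, here is a quick back-of-the-envelope calculation in Python. It only uses the figures quoted above and assumes the 24 and 12 hours are total wall-clock times; the function and variable names are just for illustration.

```python
# Rough throughput comparison of the two runs described above.
SECONDS_PER_HOUR = 3600


def rows_per_second(total_rows: int, hours: float) -> float:
    """Average rows moved per second over the whole run."""
    return total_rows / (hours * SECONDS_PER_HOUR)


def page_count(total_rows: int, page_size: int) -> int:
    """Number of paged queries needed to cover the table (ceiling division)."""
    return -(-total_rows // page_size)


# Run A: 18 million rows, page size 500,000, 24 hours.
rate_a = rows_per_second(18_000_000, 24)
pages_a = page_count(18_000_000, 500_000)

# Run B: 27 million rows, page size 100,000, 12 hours.
rate_b = rows_per_second(27_000_000, 12)
pages_b = page_count(27_000_000, 100_000)

print(f"Run A: {rate_a:.1f} rows/s over {pages_a} pages")
print(f"Run B: {rate_b:.1f} rows/s over {pages_b} pages")
print(f"Run B is about {rate_b / rate_a:.1f}x faster per row")
```

So the second run moves roughly three times as many rows per second, even though it issues many more paged queries.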

On the source DB the workload is fine. I don't understand why the behavior is different. Is a smaller pagination size preferred? Do I have to use persistent queues?
Any suggestions?

A dirty solution could be splitting the result set in two and running the two pipelines at the same time, but I think there should be a better solution than this.
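As a minimal sketch of what that split could look like, the Python snippet below computes the key ranges each pipeline would read. It assumes the table has a numeric primary key (called `id` here, which is hypothetical) whose minimum and maximum are known; `split_id_range` is an illustrative helper, not part of Logstash.

```python
from typing import List, Tuple


def split_id_range(min_id: int, max_id: int, parts: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [min_id, max_id] into contiguous,
    non-overlapping inclusive sub-ranges of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if max_id < min_id:
        raise ValueError("max_id must not be smaller than min_id")

    total = max_id - min_id + 1
    base, extra = divmod(total, parts)

    ranges = []
    start = min_id
    for i in range(parts):
        # The first `extra` ranges take one additional id each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more parts than ids: stop instead of emitting empty ranges
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# Two pipelines over a hypothetical id range of 1..18,000,000.
for low, high in split_id_range(1, 18_000_000, 2):
    # Each pipeline's SQL statement would be restricted to its own range.
    print(f"WHERE id BETWEEN {low} AND {high}")
```

Each range would go into the statement of a separate pipeline. If the ids have large gaps, the ranges will not hold the same number of rows, so the two pipelines may finish at different times.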

KR

Roberto

