1 huge pipeline vs 2 medium ones


let's say I've got a cluster with 10 machines with 12 cores each, and I see data is not being collected fast enough, so I need to raise my thread count, which is already at 12.
Is it better to stick with the same pipeline and raise the threads to 16 or even 24, or to build another pipeline that collects the same data, so I'd have two with 12 threads each?
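To make the second option concrete, here's roughly what I mean by two pipelines in pipelines.yml (pipeline names and paths are just placeholders, not my actual config):

```yaml
# pipelines.yml - two pipelines collecting the same data, 12 workers each
- pipeline.id: collect-a
  path.config: "/etc/logstash/conf.d/collect-a.conf"
  pipeline.workers: 12
- pipeline.id: collect-b
  path.config: "/etc/logstash/conf.d/collect-b.conf"
  pipeline.workers: 12
```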

Thanks in advance

This is not the kind of question that can easily be addressed in a forum like this. You need to identify what the bottleneck is and then address it. How to address it will depend on what the bottleneck is. Remember the bottleneck may not be in logstash, it could be in whatever logstash is reading from or writing to.

Well, the bottleneck definitely is Logstash, as it's pulling data out of a Redis db and then writing to another one. Both have enough resources to handle the requests, so..

What do your redis input and output look like? What changes have you tried already? Please share your redis input and output.

I've tried many different scenarios, ranging from a single pipeline with 12–36 workers to 2–3 pipelines with 12 workers each. As I'm using 10 GB of heap, I multiplied the in-flight event count by 2.5 (since that's 2.5x the standard heap) and then set the batch size so I'm just below that threshold.
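For context, the sizing I'm describing follows the usual in-flight formula (in-flight events = workers x batch size); the numbers below are illustrative, not my exact values:

```yaml
# logstash.yml - sketch of the sizing approach, values are examples only
pipeline.workers: 12
# in-flight events = 12 workers x 250 batch size = 3000 events,
# chosen to stay just below the budget scaled up with the 10 GB heap
pipeline.batch.size: 250
```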

As I said in the previous post, what changes did you specifically make in the redis input and output?

What does your redis input look like? Did you change the number of threads and batch_count in the redis input? Changing those settings may help a lot with ingestion.

Changing pipeline.workers and pipeline.batch.size will mainly impact the filter and output blocks, but will make little to no difference in the input, which seems to be where your issue is.
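Something along these lines, for example (host, key, and the exact values are placeholders you'd tune for your setup):

```
input {
  redis {
    host        => "redis.example.com"
    data_type   => "list"
    key         => "logstash-queue"   # placeholder list key
    threads     => 4                  # parallel connections pulling from redis (default is 1)
    batch_count => 125                # events fetched per request
  }
}
```

Raising threads lets the input pull from Redis concurrently instead of with a single connection, which is usually where the ingestion gain comes from.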

Please share your logstash configuration with your input and your logstash.yml and pipelines.yml.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.